I have the following table
Address, City, Data1, Data2, Data3
123 North 5th Street, San Francisco, A, B, C
123 N 5th Street, San Francisco, [Blank], D, [Blank]
123 North 5th St, San Francisco, E, F, G
I want to merge the data based on 2 criteria: the first 4 digits of the address and the city.
So the merge row would look like:
123 North 5th Street, San Francisco, AE, BDF, CG
I have about 6000 records include the "duplicates". I have the table in both access and excel, any help would be appreciated.
Are you sure you want to carry out the matching on that criteia? Would you for instance want to match the following record with those three above 123 North 4th Street, San Francisco?
The ideal way to do this is to address standardise the data first and then deduplicate it. In NZ for instance we use a PAF file (Postal Address File) to standardise the addressing and accurately issue a DPID (delivery point identifier) to each record. You'd then be in a position to match the data to carry out your second step of merging records (which is still a tricky exercise). There would be plenty of vendors around to facilitate this for you in the US for a small fee, Axciom I believe is a global player in this space.
If you don't want to do that, another option is to use a third party tool to match those records with some fuzzy logic rather than coding it yourself, I've used an addin to Excel before by a company called DQGlobal to run over data and match records.