Synchrosoft Limited
 
Painless and cost-effective solutions for data quality problems
   
 
Make your data whiter than white
 

Data Cleansing Services


  Data Auditing and Data Quality Assessment

Good quality data is consistent, reliable and fit for purpose.

Reasons why data may not be fit for purpose:

 - duplicate or "near duplicate" information
 - information missing
 - incorrect information
 - inconsistent information
 - conflicting information
 - out of date information

Auditing data

Synchrosoft provides a formal methodology for auditing data. This involves checking each field in every dataset and identifying and reporting anomalies.

Using our IOTrak software we can define the rules required to analyse a single dataset or multiple datasets of any complexity. The output from the analysis process provides a complete picture of the quality of the data and detailed lists of anomalies found.
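
As an illustration only, the kind of field-by-field audit described above can be sketched in a few lines of Python. This is not IOTrak code: the rule table, the field names and the `audit` function are all hypothetical, and the postcode pattern is a deliberately simplified assumption.

```python
import re
from collections import Counter

# Hypothetical audit rules: field name -> (description, check function).
# The postcode pattern is a simplified assumption, not a full validator.
RULES = {
    "postcode": ("looks like a UK postcode",
                 lambda v: re.fullmatch(r"[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}", v) is not None),
    "phone": ("digits only, 10 or 11 long",
              lambda v: v.isdigit() and len(v) in (10, 11)),
}

def audit(records):
    """Check every ruled field in every record and collect the anomalies."""
    anomalies = []
    for row_number, record in enumerate(records, start=1):
        for field, (description, check) in RULES.items():
            value = (record.get(field) or "").strip()
            if not value:
                anomalies.append((row_number, field, "missing"))
            elif not check(value):
                anomalies.append((row_number, field, "fails rule: " + description))
    # Summary statistics: number of anomalies found per field.
    summary = Counter(field for _, field, _ in anomalies)
    return anomalies, summary

sample = [
    {"postcode": "SW1A 1AA", "phone": "02079460000"},
    {"postcode": "NOT A CODE", "phone": ""},
]
anomalies, summary = audit(sample)
for row_number, field, problem in anomalies:
    print(row_number, field, problem)
```

The detailed list (`anomalies`) and the per-field counts (`summary`) correspond loosely to the "detailed lists of anomalies" and the overall quality picture mentioned above.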

Reasons to audit your data:

 - It enables you to make judgements about "fitness for purpose".
 - It provides detailed statistics to assist you with risk analysis.
 - It helps you to assess levels of compliance with standards, data quality
    regulations and legal constraints.
 - It provides evidence of "due diligence" relating to data quality issues.
 - It enables you to measure the quality gap in your data and provides statistics
    to help you assess the cost of improving the data quality.
 - It provides the opportunity for bad data that cannot be corrected automatically
    to be referred back to its originators for correction.


Data Transformation

When transferring data from one system to another, some degree of transformation is usually required. This may include converting codes, reformatting dates, standardising names, addresses and telephone numbers, merging records and correcting errors and omissions.

IOTrak can be customised to do all of these things. Address standardisation can be carried out using Ordnance Survey AddressPoint, the Post Office Address File (PAF), the National Land and Property Gazetteer (NLPG) or a reference list of addresses you provide yourself.

The principles involved are:

 - "Untouched by human hand" - all transformation processes are automatic, so
    manual manipulation using Microsoft Excel or Microsoft Access, for example, is eliminated.
 - Original data is left unchanged and the transformed data is appended to each
    record along with a quality code confirming the transformation process applied.
 - All input records are accounted for.
 - All fields used are quality checked and each is given a positive or negative quality
    code to confirm this.
 - Unacceptable data can be diverted into exception files.
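
The principles above can be sketched as follows. This is a minimal example under stated assumptions, not a description of how IOTrak works internally: the field names (`date`, `date_iso`, `date_quality`), the quality codes and both functions are hypothetical.

```python
from datetime import datetime

def transform_date(record):
    """Return a copy of the record with a reformatted date and a quality code appended."""
    result = dict(record)  # the original fields are copied, never overwritten
    raw = (record.get("date") or "").strip()
    try:
        parsed = datetime.strptime(raw, "%d/%m/%Y")
        result["date_iso"] = parsed.strftime("%Y-%m-%d")
        result["date_quality"] = "OK"        # positive quality code
    except ValueError:
        result["date_iso"] = ""
        result["date_quality"] = "BAD_DATE"  # negative quality code
    return result

def process(records):
    """Transform every record, diverting unacceptable ones to an exception list."""
    accepted, exceptions = [], []
    for record in records:
        out = transform_date(record)
        if out["date_quality"] == "OK":
            accepted.append(out)
        else:
            exceptions.append(out)
    # Every input record must end up in exactly one of the two outputs.
    assert len(accepted) + len(exceptions) == len(records)
    return accepted, exceptions

accepted, exceptions = process([
    {"id": 1, "date": "03/02/2006"},
    {"id": 2, "date": "31/02/2006"},  # invalid: there is no 31 February
])
```

Because each output record still carries the original `date` value alongside the transformed one, any transformation can be traced and checked after the event.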

Extracts from the processed data can be taken for input to other systems.


Data Matching and Deduplication

IOTrak also has fully customisable facilities for identifying duplicate or "near duplicate" records within a dataset, and for matching data from multiple datasets to create combined results.
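
To show what "near duplicate" detection can mean in practice, here is a minimal sketch using Python's standard `difflib` module. The normalisation steps, the 0.85 threshold and the function names are assumptions chosen for the example; they are not IOTrak's matching rules.

```python
from difflib import SequenceMatcher
from itertools import combinations

def normalise(text):
    """Lower-case the text, drop commas and full stops, and collapse spaces."""
    return " ".join(text.lower().replace(",", " ").replace(".", " ").split())

def near_duplicates(records, threshold=0.85):
    """Return (index, index, score) for every pair scoring at or above the threshold."""
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(records), 2):
        score = SequenceMatcher(None, normalise(a), normalise(b)).ratio()
        if score >= threshold:
            pairs.append((i, j, round(score, 2)))
    return pairs

names = [
    "J. Smith, 12 High Street",
    "J Smith 12 High St",
    "A. Jones, 4 Mill Lane",
]
pairs = near_duplicates(names)
print(pairs)
```

Comparing every pair is only practical for small datasets; for larger ones, records would normally be grouped first (for example by postcode) so that only likely candidates are compared.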

Data can be matched to any address list such as AddressPoint, PAF or the NLPG to obtain formatted addresses and map co-ordinates for input to a geographic information system (GIS).

Commercial data sources such as the Thomson Directory or Yell can be used in various ways. An address list can be matched to them so that each record is geocoded and has standard industry codes attached for subsequent analysis.

Your CRM data or mailing list data can be matched to a commercial data source to check addresses and telephone numbers and pick up other information such as e-mail and web site addresses. Differences found can either be corrected automatically or routed to an exception file for manual verification or correction.
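
The correct-or-refer approach described above might look like the sketch below. Everything in it is hypothetical: the match key (name plus postcode), the choice of `phone` as the only automatically corrected field, and the sample records are assumptions made for illustration.

```python
def match_key(record):
    """Hypothetical match key: normalised name plus postcode without spaces."""
    return (record["name"].strip().lower(),
            record["postcode"].replace(" ", "").upper())

def reconcile(crm_records, reference_records, auto_correct=("phone",)):
    """Match CRM records to a reference source; correct, fill in or refer differences."""
    reference = {match_key(r): r for r in reference_records}
    updated, exceptions = [], []
    for record in crm_records:
        ref = reference.get(match_key(record))
        if ref is None:
            exceptions.append({"record": record, "reason": "no match in reference source"})
            continue
        result = dict(record)  # the original record is left unchanged
        for field, ref_value in ref.items():
            if field in ("name", "postcode") or ref_value == record.get(field):
                continue
            if field in auto_correct or not record.get(field):
                result[field] = ref_value  # correct, or fill in missing data
            else:
                exceptions.append({"record": record, "reason": "differs on " + field})
        updated.append(result)
    return updated, exceptions

crm = [
    {"name": "Acme Ltd", "postcode": "ab1 2cd", "phone": "01234 000000", "address": "1 High St"},
    {"name": "Unknown Co", "postcode": "ZZ9 9ZZ", "phone": "", "address": ""},
]
reference = [
    {"name": "ACME LTD", "postcode": "AB1 2CD", "phone": "01234 567890",
     "address": "1 High Street", "email": "info@acme.example"},
]
updated, exceptions = reconcile(crm, reference)
```

Here the telephone number is corrected and the e-mail address picked up automatically, while the differing address and the unmatched record are placed in `exceptions` for manual verification.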


Free Text Conversion

Specialist features in IOTrak enable free text to be scanned and pieces of useful information placed into formatted fields. This can be used in many different scenarios to extract information such as names, addresses, places, vehicles, descriptions and pathology test results, or any other identifiable entities. IOTrak can also pick out numeric data and coded data such as map references and vehicle registrations.

This advanced methodology requires a detailed word and phrase analysis to be carried out first by a data analyst. Related groups and hierarchies of words or phrases, as well as "context link" words, can be defined in lists. The analyst can then specify how they should be categorised and interpreted into formatted information.
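
A much-simplified sketch of the idea is shown below. It is not the IOTrak method: the two patterns cover only one style of UK vehicle registration and six-figure grid references, and the word list stands in for the analyst-defined lists described above. All names are hypothetical.

```python
import re

# Assumed pattern: current-style UK registrations such as "AB51 CDE".
REGISTRATION = re.compile(r"\b[A-Z]{2}\d{2} ?[A-Z]{3}\b")
# Assumed pattern: grid references of two letters and six digits, e.g. "TQ 123 456".
GRID_REF = re.compile(r"\b[A-Z]{2} ?\d{3} ?\d{3}\b")
# Hypothetical analyst-defined list mapping words to a category value.
VEHICLE_WORDS = {"van": "VAN", "lorry": "LORRY", "car": "CAR", "saloon": "CAR"}

def extract(text):
    """Scan free text and place recognised items into formatted fields."""
    words = re.findall(r"[a-z]+", text.lower())
    return {
        "registrations": REGISTRATION.findall(text),
        "grid_refs": GRID_REF.findall(text),
        "vehicle_types": sorted({VEHICLE_WORDS[w] for w in words if w in VEHICLE_WORDS}),
    }

note = "White van reg AB51 CDE seen near TQ 123 456, followed by a blue saloon."
fields = extract(note)
print(fields)
```

Real free text needs far richer lists and context rules than this, which is why the word and phrase analysis has to come first.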


© Copyright Synchrosoft Limited, 2006.
