
How to Identify and Fix the Causes of Bad Data

April 28, 2021 | May 5th, 2021

In 2017, research from CIO magazine found that about a third of CRM projects fail, while more recent research puts the failure rate much higher, at 50-70%. In this context, CRM failure means that the project resulted in losses or did not improve the company’s performance. One of the most common reasons companies cite for such failures is low data quality, which turns your CRM into a burden rather than an asset. In this article, we will discuss some of the main culprits of bad data, how they get into your CRM, and what you can do to fix them. Let’s start with the first one.

1. Duplicate Data 

Duplicate data costs you money in more ways than you might imagine. It can drain your marketing budget, cost you sales opportunities and create confusion about the information you have. Many CRMs, like Salesforce, provide some deduplication functionality out of the box, but it is limited to the basics. This is why many companies turn to third-party applications like DataGroomr to resolve tough deduplication challenges. What separates DataGroomr from other deduplication apps is that it employs machine learning to fight bad data, as opposed to the manual rule-creation methods used by other products. This means that there is no complicated setup process. DataGroomr’s machine learning models also leverage active learning to constantly improve the results, so you do not have to manually tinker with rules.
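To make the idea of a basic deduplication rule concrete, here is a minimal Python sketch of an exact-match check on a normalized email address. This is the simple rule-based style of matching, not DataGroomr’s machine learning approach, and the record layout and the `normalize_email` and `find_duplicates` helpers are hypothetical names chosen for illustration.

```python
from collections import defaultdict


def normalize_email(email: str) -> str:
    """Lowercase and trim an email so trivially different spellings match."""
    return email.strip().lower()


def find_duplicates(records: list[dict]) -> list[list[dict]]:
    """Group records that share the same normalized email address."""
    groups = defaultdict(list)
    for record in records:
        key = normalize_email(record.get("email") or "")
        if key:  # records without an email cannot be matched by this rule
            groups[key].append(record)
    # Only groups with more than one record are duplicates
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical sample contacts; ids 1 and 2 differ only in case and spacing
contacts = [
    {"id": 1, "name": "Ann Lee", "email": "ann.lee@example.com"},
    {"id": 2, "name": "Ann Lee", "email": " Ann.Lee@Example.com "},
    {"id": 3, "name": "Bob Ray", "email": "bob.ray@example.com"},
]
duplicate_groups = find_duplicates(contacts)
```

An exact-match rule like this only catches records whose key fields are identical after normalization. It will miss the same person entered with two different email addresses or a misspelled name, which is the kind of case that calls for fuzzier matching.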

2. Missing Data 

When a contact is missing critical data, such as an email or phone number, the entire record has questionable utility. Even when less critical data is missing, opportunities to personalize and better understand the prospect are too often lost. It should be noted that most problems related to missing data can be tied to human error as opposed to your CRM. To fix these problems, set clear expectations with everyone involved in gathering this information about which data must be included. Data aggregators such as Dun & Bradstreet or ZoomInfo can also be tapped to enrich data and fill in some of those missing fields.
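Before you can fix missing data, you need to know which records are affected. The following Python sketch shows one way to report incomplete contacts. It assumes that "critical" means email and phone, and the `REQUIRED_FIELDS` tuple, the `missing_fields` helper and the sample records are all hypothetical.

```python
REQUIRED_FIELDS = ("email", "phone")  # assumed definition of "critical" data


def missing_fields(record: dict, required=REQUIRED_FIELDS) -> list[str]:
    """Return the required fields that are absent or blank in a record."""
    missing = []
    for field in required:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Hypothetical sample contacts
contacts = [
    {"id": 1, "email": "ann.lee@example.com", "phone": "2155550134"},
    {"id": 2, "email": "", "phone": "2155550187"},
    {"id": 3, "email": "bob.ray@example.com"},
]

# Map each incomplete contact id to the fields it still needs
incomplete = {
    c["id"]: missing_fields(c) for c in contacts if missing_fields(c)
}
```

A report like this gives the team a concrete list of records to complete by hand or to pass to an enrichment service.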

3. Dummy Data

Customers who fill out web forms are often reluctant to enter their phone number or other personal information. If the fields are required, users try to bypass these restrictions by entering dummy data such as (000)000-0000 in the phone number field. Internal staff may also be guilty of entering dummy data into the CRM. For example, it’s pretty common to see street addresses like 123 Fake Street entered just so the record can be saved in the CRM. It’s pretty impractical to find this type of dummy data manually. This is why DataGroomr uses its artificial intelligence engine to discover and isolate these values in your CRM.
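To illustrate what dummy values look like to a program, here is a minimal rule-based Python sketch that flags the two examples mentioned above. It is a simple heuristic, not how DataGroomr’s engine works, and the `PLACEHOLDER_ADDRESSES` set and both helper functions are hypothetical.

```python
import re

# Assumed list of known placeholder addresses; extend it for your own data
PLACEHOLDER_ADDRESSES = {"123 fake street", "n/a", "none", "test"}


def is_dummy_phone(phone: str) -> bool:
    """Flag phone numbers whose digits are all the same, e.g. (000)000-0000."""
    digits = re.sub(r"\D", "", phone)  # keep digits only
    return len(digits) > 0 and len(set(digits)) == 1


def is_dummy_address(address: str) -> bool:
    """Flag addresses that match a known placeholder, ignoring case."""
    return address.strip().lower() in PLACEHOLDER_ADDRESSES


phone_flagged = is_dummy_phone("(000)000-0000")
address_flagged = is_dummy_address("123 Fake Street")
```

Hand-written rules like these only catch the patterns you have already thought of, which is why they become hard to maintain as the variety of dummy values grows.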

4. Wrong Field

It is very common for certain data to end up in the wrong CRM field. This could be the result of an uploading issue or simply someone entering the details into the wrong place. One of the ways you can prevent manual errors is by limiting free-form text fields in your CRM. For example, Phone Number and Zip Code fields should be configured to require a specific number of numeric characters. That way, if someone tries to enter a zip code into a phone field, they will receive an error.   
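In a CRM you would normally configure this kind of restriction through the field settings rather than in code, but the underlying check can be sketched in Python. The sketch assumes 10-digit US phone numbers and 5-digit US zip codes entered as digits only; the field names and the `validate_field` helper are hypothetical.

```python
import re

# Assumed formats: digits only, US-style lengths
FIELD_PATTERNS = {
    "phone": re.compile(r"\d{10}"),
    "zip_code": re.compile(r"\d{5}"),
}


def validate_field(name: str, value: str) -> None:
    """Raise ValueError if the value does not fit the format for its field."""
    pattern = FIELD_PATTERNS.get(name)
    if pattern is not None and not pattern.fullmatch(value):
        raise ValueError(f"{value!r} is not a valid value for {name}")


validate_field("zip_code", "19103")  # accepted

try:
    validate_field("phone", "19103")  # a zip code typed into the phone field
    rejected = False
except ValueError:
    rejected = True
```

Because the two formats have different lengths, a zip code entered into the phone field fails the check immediately instead of being saved.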

For imports, issues tend to arise when columns are mapped to incorrect fields. For example, you may have acquired a database of customer addresses arranged in the following order:  

Title, Last Name, First Name, Address 1, Address 2, City, State, Zip Code 

Your CRM has fields for:  

Title, Last Name, First Name, Address 1, Address 2, Address 3, Zip Code 

In this example, the CRM expects the City and State to be combined in one field. Even if the data is imported successfully, the mismatch will only confuse users.
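One way to handle a mismatch like this is to transform each row before the import. The Python sketch below maps the source columns onto the CRM layout from the example, under the assumption that Address 3 is the field meant to hold the combined City and State. The `map_row` helper and the sample row are hypothetical.

```python
import csv
import io


def map_row(source: dict) -> dict:
    """Map one source row onto the CRM layout, merging City and State."""
    parts = (source["City"].strip(), source["State"].strip())
    city_state = ", ".join(part for part in parts if part)
    return {
        "Title": source["Title"],
        "Last Name": source["Last Name"],
        "First Name": source["First Name"],
        "Address 1": source["Address 1"],
        "Address 2": source["Address 2"],
        "Address 3": city_state,  # assumption: City and State belong here
        "Zip Code": source["Zip Code"],
    }


# Hypothetical one-row export using the source column order from the example
SAMPLE_CSV = """Title,Last Name,First Name,Address 1,Address 2,City,State,Zip Code
Ms.,Lee,Ann,10 Main St,Suite 4,Philadelphia,PA,19103
"""

rows = [map_row(row) for row in csv.DictReader(io.StringIO(SAMPLE_CSV))]
```

Mapping by column name rather than by position means the import still works if the source file arrives with its columns in a different order.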

One way to address problems like the one in the previous example is to take advantage of the available automations for data control. Most CRMs allow customization down to the field level. Administrators should use this to design systems that police themselves. This includes using default values and auto-population as much as possible, creating helpful field dependencies, designing access controls that give account/contact/lead permissions only to the right users, and other protective measures.

While automation is the low-hanging fruit, in the long term you will need to make changes to your data management process that will ensure that the data you have is accurate and timely. Let’s explore some of these options further…  

Take a Comprehensive Approach to Get Rid of Your Bad Data 

It is entirely possible to clean up your data and optimize your data management processes. This is a time-consuming process that requires buy-in from key members of the organization and a significant investment in resources, but it will pay big dividends in the long run. We did a deep dive into this subject in earlier posts. For more information, we recommend reading our series about conducting a data quality assessment. For now, let’s take a look at some key steps you can start taking today to improve the health of your data:

1. Identify the Sources of Low-Quality Data 

By identifying where this data is originating, you will be able to adjust your processes to minimize its impact. Some common sources of low-quality data include:  

  • Online forms which allow duplicates 
  • Unclear fields, where users tend to input incorrect information  
  • Manual data entry errors 
  • Unclear data standards and formatting 

2. Prevent the Collection of Bad Data at the Source

Implement strict data normalization processes and formatting standards across your entire organization. Ensure that everybody understands precisely how to input data and adheres to these policies. Also, make the fields in your online forms clear and focused to encourage users to enter information correctly.
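As an illustration of what a normalization rule can look like, here is a minimal Python sketch that converts phone numbers to a single format. It assumes US numbers and an arbitrary house format of (XXX) XXX-XXXX; the `normalize_phone` helper is hypothetical, and your own standard may differ.

```python
import re
from typing import Optional


def normalize_phone(raw: str) -> Optional[str]:
    """Return the number as (XXX) XXX-XXXX, or None if it cannot be normalized."""
    digits = re.sub(r"\D", "", raw)  # drop spaces, dots, dashes, parentheses
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # strip the US country code
    if len(digits) != 10:
        return None  # leave incomplete numbers for manual review
    return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"


formatted = [normalize_phone(p) for p in ("215.555.0134", "+1 215 555 0134", "555-0134")]
```

Returning `None` instead of guessing keeps incomplete numbers visible, so they can be corrected rather than stored in a misleading format.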

3. Cleanse Your Data on a Regular Basis  

In real-life scenarios, preventing all bad data from entering your systems is not possible. So, no matter what you do, data will never really be perfect. However, by cleansing low-quality data on a regular basis you will be able to reduce the number of inaccuracies in your information. This is where tools become essential.  

Start Taking Action Right Away 

The low-quality data issues you are dealing with will not go away by themselves, and the sooner you start addressing them, the better. A good place to start is by getting an overview of the health of your data. DataGroomr provides you with an instant data quality analysis that tells you how many duplicates you have in your Salesforce org. You can then merge those duplicates and set up monitoring to continuously detect and eliminate any new ones.

Try DataGroomr today with our free 14-day trial.  

Steven Pogrebivsky

Steve Pogrebivsky has founded multiple successful startups and is an expert in data and content management systems with over 25 years of experience. Previously, he co-founded and was the CEO of MetaVis Technologies, which built tools for Microsoft Office 365, Salesforce and other cloud-based information systems. MetaVis was acquired by Metalogix in 2015. Before MetaVis, Steve founded several other technology companies, including Stelex Corporation which provided compliance and technical solutions to FDA-regulated organizations.