
November ’20 Release: What’s New in DataGroomr

December 7, 2020

A few major features were added to the DataGroomr app in November: the ability to create and train your own machine learning model, the ability to create and customize reports on the DataGroomr dashboard, and an automated method for resolving redundant Contact and Account relationships. Let’s take a look at these new features in more detail.

Create Your Own Matching Rules

DataGroomr is the only Salesforce deduplication solution that uses Machine Learning. The very first time that you log in, our pre-built algorithms analyze your data and identify duplicates. These machine learning models are designed and tuned based on our experience of working with hundreds of thousands of records from real organizations. While this generic approach has proven to be highly effective, some of our customers have asked for more granular control over the fields and comparison criteria that the models use to detect duplicates. We are delivering these controls in this release.

You will notice that we have added a new section called Matching Rules under the Supervisr module. It displays all objects that are accessible through the DataGroomr platform.

You can click into any of these objects to view and manage associated matching rules.

Let’s demonstrate the process of creating a new model tailored to your organization.  Press the Add Model button to begin.

The Load… button can be used to populate a model with the list of DataGroomr default fields. If your Salesforce org already has Salesforce matching rules defined, these may be loaded as well.

We can add more fields and change the comparison type for each field (read more about adding and editing fields). The Auto training switch enables the model to be periodically retrained based on users’ merge/unmatch actions.

In the next step, the model needs to be trained on your data to detect duplicates. Click Train > and you’ll be presented with the question “Are these records the same entity?” and three responses: Yes, No, or Not Sure. Select your response and new records will be presented. The recommended number of answers that the model needs for accurate duplicate detection is displayed in the i icon. It is calculated by multiplying the total number of model fields by five, for both positive and negative answers.
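The calculation behind the recommended number of answers can be sketched in a few lines of Python. This is only an illustration of the rule as described above, not DataGroomr code: the function name and the returned keys are hypothetical, and it assumes that the "fields times five" target applies separately to Yes (positive) and No (negative) answers.

```python
def recommended_answers(num_fields: int, per_field: int = 5) -> dict:
    """Estimate how many training answers a model should receive.

    Assumption: the target (fields * 5) applies separately to
    positive (Yes) and negative (No) answers.
    """
    if num_fields < 1:
        raise ValueError("a model needs at least one field")
    per_label = num_fields * per_field
    return {"yes": per_label, "no": per_label, "total": 2 * per_label}


# Example: a model that compares six fields
targets = recommended_answers(6)
print(targets)
```

Under this reading, a model with six fields would call for 30 Yes and 30 No answers. Not Sure responses are not counted here, since the rule only mentions positive and negative answers.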

Once a model is trained, it can either be designated as the default using the Toggle Default button or assigned to a dataset using the Assign button.

You can also select any matching rule in the Dataset configuration dialog in Trimmr.

Go ahead and give it a try, and let us know how duplicate detection works for you.

Customizable Dashboard

Our customers have been very happy with our deduplication tool but have asked for a better way to visualize the value that they are getting. While a few methods to create data reports have been available, it’s always more satisfying to get what you want with little or no effort. We have reworked the DataGroomr app home page to display information about your organization’s data hygiene in simple charts, consolidated in a customizable dashboard.

By default, we now display the Duplicates Summary and Duplicates History widgets. These can be resized, moved, and configured to display different time frames and different datasets. The following widgets are currently available (using the Add widget button):

  • Duplicates Summary
  • Duplicates History
  • Merge/Convert History
  • Site Section
  • Link to Resource

Below is a quick overview of the dashboard and widgets from our expert Sarah.

Resolve Redundant Relationships to Solve Merge Problems

Sometimes you may need to merge two contacts that are indirectly related to the same account, or two accounts that are indirectly related to the same contact. Salesforce will not allow you to do this and you will get the following error message:

Can’t merge accounts. These accounts have the same related contact. Remove the redundant account-contact relationships and then try merging again.

If you are trying to merge duplicate contacts, the text of the error message will be the same except that it will say “Can’t merge contact” and refer to contacts instead of accounts. Salesforce requires you to remove the duplicate relationship manually before it can merge the accounts or contacts.

DataGroomr will automatically resolve such redundant relationships upon merge. Read more about how DataGroomr merges Salesforce records that have the same related accounts or contacts.
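To make the idea of a redundant relationship concrete, here is a minimal Python sketch that finds the account-contact relationships that would block a merge of two accounts. It is not DataGroomr’s implementation and does not call any Salesforce API. The Relation record, its fields (loosely modeled on the Salesforce AccountContactRelation object), the function name, and the sample IDs are all hypothetical.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    # Hypothetical in-memory stand-in for an account-contact relationship
    account_id: str
    contact_id: str
    is_direct: bool


def redundant_relations(relations, master_account, duplicate_account):
    """Return relations on the duplicate account whose contact is
    already related to the master account."""
    master_contacts = {
        r.contact_id for r in relations if r.account_id == master_account
    }
    return [
        r
        for r in relations
        if r.account_id == duplicate_account and r.contact_id in master_contacts
    ]


relations = [
    Relation("A1", "C1", True),   # contact C1 belongs directly to account A1
    Relation("A2", "C1", False),  # C1 is also indirectly related to account A2
    Relation("A2", "C2", True),   # C2 is related only to A2
]

# Merging A2 into A1: the A2-C1 relationship duplicates A1-C1
to_remove = redundant_relations(relations, "A1", "A2")
print(to_remove)
```

In this example only the A2-C1 relationship is flagged, because contact C1 is already related to the surviving account A1. The same check applies to contact merges with the roles of account and contact swapped.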

What’s Coming in the Next Release?

These are the highlights of the November update; a full list of updates, improvements and fixes is available here.

Here’s what you can expect next in DataGroomr:

  • Ability to define matching rules for custom Salesforce objects
  • Ability to merge custom Salesforce objects

Make sure to let us know which features you’d like to see next or vote on the Ideas Exchange portal.

As always, we would like to remind you that we offer a free 14-day trial of DataGroomr. Simply click the ‘free trial’ button in the top right-hand corner and log in using your Salesforce credentials. No setup is required, and you can get a handle on duplicate management for your data right away.

Happy DataGrooming, Trailblazers!

Ben Novoselsky

About Ben Novoselsky

Ben Novoselsky, DataGroomr Co-Founder, is a hands-on software architect with over 19 years of experience in the design and implementation of distributed systems. He is the author of multiple publications about the design of distributed databases. Ben holds a Ph.D. in Computer Science from St. Petersburg State University.