January ’21 Release: Custom Objects, Sync to Salesforce

January 20, 2021 (updated October 20, 2022)

The release of custom matching rules based on machine learning (ML) in November paved the way for the next frequently requested feature: the ability to detect and merge duplicates in custom Salesforce objects. Another major addition to DataGroomr is the ability to export duplicate groups identified by ML algorithms to Salesforce as Duplicate Record Sets. Let’s take a look at these new features in more detail.

Custom objects

Before this release, DataGroomr could only work with Leads, Contacts, and Accounts. Now, the Matching Rules page displays two new buttons: Add Object and Clear Cache.

Clicking Add Object opens a dialog where you can select the Salesforce object you want to dedupe.

Let’s add the Opportunity object. Then click Opportunity, click Add Model, and add the fields for the machine learning model to train on. Click Train.

Then train the model by answering the question, “Are these records the same entity?”

Model accuracy improves with the number of questions answered. To calculate the recommended number of answers, we take the number of fields and multiply it by 5. The recommended number of answers is displayed at the bottom of the training dialog. As soon as the progress bar is full, we can click the Finish button and generate the model that we will use to detect duplicates.
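The sizing rule above is simple arithmetic, and a short sketch makes it concrete. The helper name `recommended_answers` and the example field list are assumptions made for illustration; they are not part of DataGroomr.

```python
def recommended_answers(num_fields: int, answers_per_field: int = 5) -> int:
    """Return the suggested number of training answers for a model.

    Mirrors the rule described in the text: number of fields times 5.
    The function name and signature are illustrative, not a DataGroomr API.
    """
    if num_fields < 1:
        raise ValueError("a model needs at least one field")
    return num_fields * answers_per_field


# Hypothetical Opportunity model trained on four fields.
fields = ["Name", "AccountId", "Amount", "CloseDate"]
print(recommended_answers(len(fields)))  # 4 fields * 5 = 20
```

Under this rule, adding a field to the model raises the recommended number of answers by five.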

Once the model is trained, we want to set it as Default for the organization so that when we create an Opportunity dataset, the model is used to detect duplicates. This concludes the process of setting up a custom object; we can now work with it the same way we work with standard objects. Let’s go back to Trimmr and add a new dataset for the Opportunity object.


When DataGroomr finishes its analysis, we can review the duplicate groups and merge records individually, or we can use the mass merge function.

On merge, DataGroomr uses proprietary logic to reparent all related records, such as Notes and Attachments, from the records being deleted to the surviving master record.

The Clear Cache button is useful when object metadata has been modified in Salesforce, for example when fields have been deleted, updated, or added. Clear Cache reloads the metadata from Salesforce.
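The general idea of reparenting can be sketched in a few lines. This is only an in-memory illustration under stated assumptions: the `RelatedRecord` type, the `reparent` function, and the record IDs are all hypothetical, and DataGroomr’s actual merge logic is proprietary and not shown here.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RelatedRecord:
    """A child record (for example a Note or Attachment) and its parent."""
    record_id: str
    kind: str
    parent_id: str


def reparent(related, master_id, merged_ids):
    """Point children of the merged-away records at the surviving master.

    Illustrative only: this is not DataGroomr's implementation.
    Returns new records and leaves the input list unchanged.
    """
    merged = set(merged_ids)
    return [
        replace(child, parent_id=master_id) if child.parent_id in merged else child
        for child in related
    ]


# Hypothetical IDs: "opp-2" is merged into the master "opp-1".
children = [
    RelatedRecord("note-1", "Note", "opp-2"),
    RelatedRecord("att-1", "Attachment", "opp-1"),
    RelatedRecord("note-2", "Note", "opp-3"),
]
for child in reparent(children, master_id="opp-1", merged_ids=["opp-2"]):
    print(child.record_id, "->", child.parent_id)
```

Only children of the deleted records change parent; records that already belong to the master, or to unrelated records, are left as they are.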

Export duplicates to Salesforce

We have enhanced the record export functionality so that you can write duplicate record groups to Salesforce as Duplicate Record Sets, similar to what a Salesforce duplicate job does.

This lets you take advantage of all the duplicate record management features Salesforce has to offer, including Duplicate Record Reports and the native Compare and Merge.

The Sync to Salesforce button displays a dialog where you can specify the desired match confidence. Records with a match confidence equal to or higher than the selected value are exported to Duplicate Record Sets under the specified Salesforce duplicate rule.
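The threshold rule described above is an inclusive "greater than or equal" filter. The sketch below shows that selection step only, not the export itself. It assumes confidence is stored per duplicate group as a percentage; the `DuplicateGroup` type, the `select_for_sync` function, and the record IDs are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class DuplicateGroup:
    """A group of record IDs that a model flagged as likely duplicates."""
    record_ids: list
    confidence: float  # assumed scale: match confidence as a percentage, 0-100


def select_for_sync(groups, min_confidence):
    """Keep groups whose confidence is equal to or higher than the threshold."""
    return [g for g in groups if g.confidence >= min_confidence]


# Hypothetical groups; the IDs are placeholders, not real Salesforce IDs.
groups = [
    DuplicateGroup(["006A", "006B"], 97.0),
    DuplicateGroup(["006C", "006D", "006E"], 80.0),
    DuplicateGroup(["006F", "006G"], 62.5),
]
for group in select_for_sync(groups, min_confidence=80.0):
    print(group.record_ids, group.confidence)
```

Because the comparison is inclusive, a group that sits exactly at the selected value is included in the sync.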

Extended mass merge audit

Auditr has been extended to display detailed information about mass merges, including the IDs of the merged records and the updated fields. Expanding a Mass Merge event in Auditr displays a list of records that you can drill down into to see details.

What’s Coming in the Next Release?

These are the highlights of the January update. A full list of updates, improvements, and fixes is available here.

Here’s what you can expect next in DataGroomr:

  • Ability to manage users and sandbox organizations
  • Various user-requested improvements

Make sure to let us know which features you’d like to see next or vote on the Ideas Exchange portal.

As always, we would like to remind you that we offer a free 14-day trial of DataGroomr. Simply click the ‘free trial’ button in the top right-hand corner and log in using your Salesforce credentials. There is no setup required and you can get a handle on the duplicate management of your data right away. 

Happy DataGrooming, Trailblazers!

Ben Novoselsky

Ben Novoselsky, DataGroomr CTO, is a hands-on software architect with over 19 years of experience in the design and implementation of distributed systems. He is the author of multiple publications about the design of distributed databases. Ben holds a Ph.D. in Computer Science from St. Petersburg State University.