- What Salesforce feature do I use to import data?
- Am I able to import data myself or do I need an admin with privileges to do it for me?
- What format should the data be in?
- How do I tell Salesforce which data column populates which field?
- How do I populate null values into fields?
- How much data can I import at once?
- How long does an import take? Can I check its status in the meantime?
- How do I check the results?
- Can I undo an import if I don’t like the results?
- Is there an easier way to go about this?
We’ll answer all these questions in this blog post.
Importing data can be confusing because Salesforce provides multiple tools that can do it. Here are the two key tools:
- Data Import Wizard
- Data Loader
How are these tools different and which should be used when?
The Data Import Wizard is an in-browser Salesforce tool that guides the user through the import process. It’s designed for smaller data files, with at most 50,000 rows to import or update. It can only import data into certain objects, however:
- Business accounts and contacts
- Person accounts
- Leads
- Solutions
- Campaign members
- Custom objects
So why use it? First off, having it as an in-browser tool internal to Salesforce makes it easier to find and operate. Secondly, it has some useful features for handling duplicate records.
Duplicate Record Identification
The Data Import Wizard can identify rows in your import file that are duplicates of existing records in the destination object. It does so by matching one or more fields that you select to use for determining whether a row is a duplicate. The fields you can choose from for duplicate matching vary by object, but some examples are:
- Salesforce ID
- External ID
An example of using multiple fields for matching duplicates is the combination of the name and site fields for business accounts.
If no matching fields are chosen or a row is not matched, it’s assumed to be unique and a new record is created for it.
Duplicate Record Handling
There are a couple different behaviors the import wizard can take when it encounters a duplicate record:
- Prevent the record from being imported
- Update the existing record with new data from the row in the import file
If you choose to update the existing record, in most cases the wizard will not replace existing field values even if new values were provided for them. Instead, it will import new data only into fields that were previously empty. The exception is if you select the option to “Overwrite existing account values” when importing business accounts.
The Data Loader is a tool external to Salesforce that allows you to import far more records, up to five million of them at a time. It supports loading the data into any object, freeing you from the object constraints of the Data Import Wizard. And it can be scheduled to run recurring imports at regular intervals, for example nightly.
It’s a more complex tool than the Data Import Wizard, but also far more versatile. It can even be called from the command line for those ambitious admins who are comfortable with command-line execution.
The Data Loader isn’t limited to simply inserting or updating data. Here’s the full list of operations it can perform:
- Insert (creates new records)
- Update (modifies existing records)
- Upsert (inserts records that don’t yet exist and updates those that do; the data must include an ID column to match against either a Salesforce ID or an external ID for updates)
- Delete (moves records into the Salesforce recycle bin)
- Hard delete (skips the recycle bin, so deleted records become unrecoverable)
- Mass delete (deletes records in large groups)
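To make the upsert operation concrete, here’s a minimal sketch of building an upsert file in Python. The external ID field name "External_Id__c" and all the values are hypothetical; the point is that the file must carry an ID column to match against.

```python
import csv
import io

# Hypothetical upsert rows for the Account object. "External_Id__c" is
# an assumed custom external ID field used to match existing records.
rows = [
    {"External_Id__c": "ERP-1001", "Name": "Acme Inc.", "Phone": "555-0100"},
    {"External_Id__c": "ERP-1002", "Name": "Globex Corp.", "Phone": "555-0199"},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["External_Id__c", "Name", "Phone"])
writer.writeheader()
writer.writerows(rows)
upsert_csv = buf.getvalue()
print(upsert_csv)
```

Rows whose external ID matches an existing record would be updated; the rest would be inserted as new records.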
Data Types Supported
The Data Loader can import field values of any of the standard data types, including Base64, Boolean, Date, Double, ID, Integer, and String.
A Base64 value is actually a path to a file. The file can be an attachment or Salesforce CRM Content (a document or a link to a document in a library).
Since the Data Loader is an external tool, it communicates with Salesforce through its APIs. There are two choices for which API to have the Data Loader use:
- SOAP API
- Bulk Data API
Each has its advantages. The Bulk Data API operates asynchronously, so it can move faster by performing multiple steps in parallel. However, parallel execution can create what’s called “database contention,” which means that multiple concurrent processes are trying to change the same record at the same time. Sometimes this will cause the import process to fail. It’s not necessarily common, but you can avoid the risk entirely by using the SOAP API, which executes all its steps sequentially. Of course, the SOAP API is slower as a result.
Results from the Data Loader process are broken into three files:
- Successes file
- Errors file
- Log file
The log file is a list of log entries in chronological order. Procedural events like logging in or out of Salesforce are marked as “INFO.” Errors such as a record missing a required field are marked as “ERROR.”
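A quick way to get a feel for a log file is to tally its entry levels. Here’s a small sketch; the exact line format is an assumption (real entries also carry timestamps and class names), but the INFO/ERROR markers behave the same way.

```python
from collections import Counter

# Assumed sample log lines in a simplified Data Loader-style format.
log_lines = [
    "2024-05-01 09:00:01 INFO Logging in to Salesforce",
    "2024-05-01 09:00:05 ERROR Row 42: required field missing: LastName",
    "2024-05-01 09:00:09 INFO Logging out of Salesforce",
]

# Count how many entries carry each level marker.
levels = Counter(
    token
    for line in log_lines
    for token in line.split()
    if token in ("INFO", "ERROR")
)
print(levels["INFO"], levels["ERROR"])  # 2 1
```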
Not all users can import the data they’d like. It depends on the privileges they have and what they’re trying to do. For background on how permissions work in Salesforce, check out my previous blog entry on Understanding Salesforce Permissions.
You’ll notice that some of the actions listed below require the “Modify all data” permission. This is one of the most restricted permissions in Salesforce because it’s so powerful (i.e. it can really screw things up if you’re not careful). Usually only admin leaders hold that permission.
When importing data, you are either creating new records and filling in their fields or updating field values on existing records. In either case, you need access to the fields you’d like to modify, which means having permission to edit them. (You also need permission to read the field, but edit permission automatically includes read permission.)
The records you’d like to create, update, or delete exist in a Salesforce object (the Salesforce term for a data table). Let’s go over which object and user permissions you’ll need for operations in each import tool.
Data Import Wizard
The permissions needed to import data with the Data Import Wizard of course depend on the object you’d like to import to, but they can also depend on the owner of the data. Here’s how it breaks down:
For accounts and contacts owned by you, you’ll need the “Import personal contacts” permission. If they are owned by others, you’ll need the “Modify all data” permission.
For person accounts owned by you, you’ll need these three permissions:
- Create records in the account object
- Edit records in the account object
- Import personal contacts
If the person accounts are instead owned by others, you’ll need these four permissions:
- Create records in the account object
- Edit records in the account object
- Edit records in the contact object
- Modify all data
For leads, you’ll just need the “Import leads” permission.
To import campaign members, the required permissions depend on exactly what’s being imported, which may be any of the following:
- Campaign member statuses
- Existing contacts
- Existing leads
- Existing person accounts
- New contacts
- New leads
You’ll need these three permissions to import custom objects:
- Import custom objects
- Create records in the custom object
- Edit records in the custom object
To import records into the solutions object, you’ll just need the “Import solutions” permission.
Data Loader
Here are the permissions required for each type of operation in the Data Loader:
- Insert: permission to create records in the object
- Update: permission to edit records in the object
- Upsert: permission to create and edit records in the object
- Delete: permission to delete records in the object
- Hard delete: permission to delete records in the object
- Mass delete: permission to “Modify all data”
Data should be put in comma-separated values (CSV) format. A CSV file is a text file where each column in a row is separated by a comma. When using the Data Loader, you can also change the settings to expect a tab-separated format instead, which is helpful if you expect commas to appear in the values. Spreadsheet applications like Excel can read and output .csv format files, so they are easy to generate.
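Here’s a small sketch of how embedded commas are handled. Python’s csv module quotes any value that contains the delimiter, and switching to a tab delimiter avoids the quoting entirely. The values are made up for illustration.

```python
import csv
import io

# A value containing a comma ("Acme, Inc.") that could break naive CSV.
rows = [["Name", "BillingCity"], ["Acme, Inc.", "San Francisco"]]

# Comma-separated: the csv module wraps the comma-bearing value in quotes.
comma_out = io.StringIO()
csv.writer(comma_out).writerows(rows)
print(comma_out.getvalue())

# Tab-separated alternative: no quoting needed, since the comma is no
# longer the delimiter.
tab_out = io.StringIO()
csv.writer(tab_out, delimiter="\t").writerows(rows)
print(tab_out.getvalue())
```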
To import contacts, you can also use files generated by contact management applications like ACT!, LinkedIn, and Outlook. The data column labels won’t match up exactly to fields in the Salesforce contacts object, but Salesforce documentation has a guide for how to run the export and how to label the fields. You can reference it here.
There are some rules regarding the formatting of values in import files. They vary by data type as listed here.
Reference fields hold values that are references to other data. The values are stored as IDs for rows of data in other objects within Salesforce. To import data into a reference field, you need to use a valid ID of a record in a valid reference object (each reference field may have one or more valid reference objects; the latter case is called polymorphism).
This is especially frustrating, and I’ll give you an example of why that is. If you want to import opportunities, you need to specify the appropriate account for each. The AccountId field for the opportunity record is a reference field that stores the ID of the corresponding account. In your data it may be easier to use the account name, for example Acme Inc. But before you import it, you’ll have to look up the corresponding ID for Acme Inc. in your Salesforce org. Multiply that by every value for every reference field in your import data. Ugh.
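The name-to-ID translation itself is simple enough to script, provided you have a trustworthy mapping. Here’s a sketch; the record IDs are hypothetical, and in practice the mapping should come from a fresh export of the Account object, not a stale spreadsheet.

```python
# Hypothetical mapping from account names to Salesforce record IDs,
# ideally built from a current export of the Account object.
account_ids = {
    "Acme Inc.": "0015g00000AbCdEFGH",
    "Globex Corp.": "0015g00000HiJkLMNO",
}

# Opportunity rows that carry human-readable account names.
opportunities = [
    {"Name": "Acme renewal", "AccountName": "Acme Inc."},
    {"Name": "Mystery deal", "AccountName": "Initech"},
]

# Swap each name for its ID; collect rows whose name has no match so
# they can be fixed instead of silently failing the import.
unresolved = []
for opp in opportunities:
    account_id = account_ids.get(opp.pop("AccountName"))
    if account_id is None:
        unresolved.append(opp["Name"])
    else:
        opp["AccountId"] = account_id

print(unresolved)  # ['Mystery deal']
```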
Picklists may be restricted or unrestricted. Restricted picklists require the value to be one of those listed in the field definition. If a value for a restricted picklist field in the import file does not match one of these accepted values, the default value for the field is used instead.
Unrestricted picklists allow you to add new values. So, if an unrestricted picklist field value in the import file doesn’t match one from the field’s list, it’s automatically added to the list of accepted values.
Picklists may also be multi-select and hold multiple values at once. In this case, multiple values in such a field in the import file should be separated by semicolons. Up to 100 values may be supplied in one entry; any more, and the field is left blank.
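Formatting a multi-select value is easy to script, and checking the 100-value limit up front beats discovering a silently blank field later. A minimal sketch:

```python
def format_multiselect(values, limit=100):
    """Join multi-select picklist values with semicolons. More than
    `limit` values would make Salesforce leave the field blank, so
    refuse them up front."""
    if len(values) > limit:
        raise ValueError(f"{len(values)} values exceeds the {limit}-value limit")
    return ";".join(values)

print(format_multiselect(["Email", "Phone", "Web"]))  # Email;Phone;Web
```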
For checkbox fields, use the value 0 to signify unchecked, and 1 to signify checked.
The format in which the date or time is written must match Salesforce display settings for your locale. For example, in the United States, month is listed before day for a date. In Europe, however, day is listed before month in some formats.
Geolocation Custom Fields
The format should include latitude and longitude measurements, separated by a semicolon. Supplying more than two values for the field makes the record import fail. If only one value is supplied, it’s interpreted as the latitude and a zero longitude is assumed.
The format in which the currency is written must match Salesforce display settings for your locale. For example, in the United States, commas are used to separate thousands and periods are used to separate decimals. In some other countries, however, the reverse is true.
If you’re using multiple currencies, you’ll have to also include a data column with the appropriate ISO currency code. ISO codes are three-letter codes like USD for United States dollars. If you omit the currency code or the one you select is inactive, it’ll be replaced by your default personal currency code in Salesforce.
Salesforce performs validation on all records in the import before importing them. Any records failing validation will not be imported.
If you don’t use the correct format for a value, it will fail validation. If you try to assign a prohibited value, it will also fail validation. Data is rarely perfectly clean, so this will happen often and present a great deal of annoyance in your effort to get all your data imported.
Once you have your data in the correct format, Salesforce still needs to understand how to interpret it. The key issue is that the Data Import Wizard or Data Loader needs to know which field in the destination object should be filled with data from each column in your file. To provide this information, you need to map your import columns to the appropriate Salesforce fields.
In an ideal world, Salesforce would just understand which field you intend by looking at the column’s label in the header row. Your label there should correspond to the label of the destination field. In fact, Salesforce does attempt to do this, comparing column header labels to field labels in order to map them automatically. If the two labels match exactly, this works fine. But if they’re off even slightly, it can become much more difficult for Salesforce to map them correctly. Salesforce may either guess and get it wrong, or not venture a guess at all.
Both the Data Import Wizard and the Data Loader provide tools to change the import mapping of data column to field. You can use them to fix guesses Salesforce makes and to add mappings it misses. You can even save and load a set of mappings for use later when you want to import a similar data file.
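One way to make automatic matching reliable is to rewrite the header row to the destination field names before importing. Here’s a sketch; the column labels and field names are hypothetical.

```python
import csv
import io

# Hypothetical saved mapping from file column labels to Salesforce
# field API names.
mapping = {"Company": "Name", "Phone Number": "Phone"}

source = "Company,Phone Number\nAcme Inc.,555-0100\n"
reader = csv.reader(io.StringIO(source))

# Remap the header row; columns without a mapping keep their label.
header = [mapping.get(col, col) for col in next(reader)]

out = io.StringIO()
writer = csv.writer(out)
writer.writerow(header)
writer.writerows(reader)  # data rows pass through unchanged
remapped_csv = out.getvalue()
print(remapped_csv)
```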
What happens if you don’t provide a data column in your import file for every field in a record? If you’re creating a new record, unmapped fields are populated with their default value. If the field has no default value but is nillable (it can be empty), it receives a null value. Note that all columns present in your import file must be mapped to a field or Data Loader will throw an error.
Null values can be a little tricky to deal with. A null value just means that there is no value assigned to a field in a record, so it’s empty. Seems easy enough, but Salesforce may not be able to correctly interpret null values from the import file. Let’s go through an example of why that’s the case.
Let’s say you’re importing data into the Contacts object with an update procedure. The Contacts object already has a record for John Smith with a phone number stored in its Phone field. Your import file has a row for John Smith to update the record, and the row has an empty value for the Phone field. Does that empty Phone value in the file mean Salesforce should erase the phone number currently saved for John Smith and make it null? Or does it mean Salesforce should skip the field and leave the existing value untouched? Your import file may only provide new data for some of the fields in a given record, after all, so you may intend for many of the fields to be skipped during the update.
The answer for how Salesforce interprets your empty value in the file depends on which tool you’re using for the import and which settings you’ve chosen. I’ll go through each.
If you use the Data Import Wizard, you cannot set any fields that currently contain data to null. If you’re creating an entirely new record and its default value is null, then it will wind up null automatically. But in our example of the Contact with a Phone value, you can’t erase the existing value and make it null.
If you use the Data Loader, the answer depends on which API you set it to use. If you use the SOAP API, there is a setting “Insert Null Values” that you can turn on or off. If you turn it on, an empty field in the import file will clear existing data in a record and make it null. If you turn it off, an empty field in the import file will not affect existing data in the record.
If you instead use the Bulk Data API, an empty field in the import file will never affect existing data in the record. But there’s still a way to set fields to null. To do so, use the value “#N/A” for the field in the import file instead of leaving the entry blank. The Data Loader will interpret that code as a signal to clear the record’s field of data.
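Preparing those “#N/A” sentinels is easy to script. In this sketch, None marks a field you want cleared and an empty string marks a field to leave alone; the field names are hypothetical.

```python
def encode_nulls(row):
    """Replace None with the literal "#N/A" so the Bulk Data API clears
    those fields; empty strings pass through and are ignored."""
    return {field: ("#N/A" if value is None else value)
            for field, value in row.items()}

row = {"FirstName": "John", "Phone": None, "Email": ""}
print(encode_nulls(row))  # {'FirstName': 'John', 'Phone': '#N/A', 'Email': ''}
```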
There are limits to how much data you can import in a single go. Different limits apply to the import file, each record being imported, and each field within a record.
The import file can be up to 100 MB in size. It’s also subject to your data storage limits, but keep in mind that storage size isn’t the same as file size, so it can be hard to gauge with precision.
As we discussed earlier, there’s also a limit to how many records each file can import. With the Data Import Wizard, the limit is 50,000 records. With the Data Loader, it’s 5,000,000 records.
Each imported record can be up to 400 KB in total size, which amounts to around 4,000 characters. Each record can have at most 90 fields mapped for import.
We noted earlier that unrestricted multi-select picklists may specify up to 100 values for a field at a time. I just wanted to point that out again here since we’re talking about field limits.
Some fields, like notes and descriptions, allow for large amounts of data, but they still have limits. These fields can take up to 32 KB of data each. When field values exceed limits for the size of the data, they may either be truncated (the excess data simply clipped off), or the record operation may fail.
If you’re using the SOAP API, the Data Loader has a setting, “Allow Field Truncation,” that specifies which behavior you want. Turn it on to truncate the data and accept the record, or off to reject the record until you fix it. The latter is useful if you don’t want to lose data without even being aware something was truncated. Note that this setting is not available in the Data Loader if you’re using the Bulk Data API. In that case, the load operation always fails for a record with data exceeding field value size limits.
Imports can include quite a bit of data, so Salesforce can’t always finish them in the blink of an eye. To manage the large number of CPU-taxing imports that customers request, all imports are put into a queue and handled in the order they’re received. Salesforce does claim that all imports are addressed within 24 hours, though that can feel like an eternity when you’re waiting on your data.
Once your import is complete, you’ll receive a confirmation email from Salesforce notifying you that it’s done. That doesn’t guarantee everything worked, but at least Salesforce is done trying.
If you’re an admin, you can check the status of a data import through the Bulk Data Load Jobs feature. In addition to status, it also lists key metrics related to the import. If you’re not an admin, it’s time to befriend one who can check for you! Or you can just leave the import tab open in your browser and keep checking to see if it’s finished, but that’s like watching a pot for the water to boil.
“Oops, I didn’t mean to do that.” When you’re dealing with a lot of data, that thought can be especially terrifying. If you’ve imported a bunch of records and now want to delete them, you have a couple options, depending on your permissions and the destination object for the records.
If you’re an admin, you can mass delete imported records through the Mass Delete Records tool in Setup, but only if they were imported into one of the objects it supports (such as accounts, contacts, leads, cases, solutions, activities, and products).
If you’ve imported records to a different object, you can use the Data Loader to perform a mass delete. Of course, you’ll still need permission to do so. As noted earlier, you’ll need the “Modify All Data” permission, which means you basically also must be an admin to do so.
There are a lot of things that can prevent an import from succeeding. Your data format may be incorrect, your mapping may be incomplete, you may be exceeding the limits on your data, or your values may be invalid. One common frustration is having to make sure you change every reference field value from a human-readable name to an inscrutable ID. Some people use lookup tables in Excel to make it easier, but that’s dangerous because the ID mappings in the spreadsheet may be out of date.
There’s an easier way. PowerUser’s AI can take care of a lot of the details for you automatically. Simply drag and drop the import file into the PowerUser chat and the AI will analyze it for import. It will check for common formatting and data issues to make sure all is ready. Wherever it can translate names to IDs for reference fields, it will do so. If there are any ambiguities, it’ll present you with a simple choice to direct what you want to happen, then remember your choices for later imports. Using PowerUser for bulk data import can cut out 80% of the time it takes to complete a successful import.
For more information or to get a demo, Contact Us at PowerUser.