Sforce Data Loader: Using the command line interface

by Andrew Waite on June 13, 2005 at 03:13 PM

The Sforce Data Loader can be scripted to run from a command line by following these steps:

1. Prepare your system
To use the Command Line Interface ("CLI") of the Sforce Data Loader your target machine (the machine upon which the loader will run via the CLI) needs to have the Java Runtime Environment ("JRE") v1.4.2_03 or greater installed and configured. JRE 1.5 is not supported.

The CLI of the Sforce Data Loader uses the same processing engine as the Graphical User Interface ("GUI") provided in the Windows installation.

In order to use the CLI you must first install the Sforce Data Loader on your Windows computer using the .exe provided in setup of Salesforce.  

If you intend to use the CLI of the Sforce Data Loader on the same machine on which it is installed you need not move any files. In this case you can skip to "3. Setting operation parameters" below.

If the target machine for your scheduled process is not the same as where you installed the GUI using the .exe then you need to perform step 2.

2. Copy the loader engine and files

Once the GUI is installed, open the directory you selected during install, the default is:

C:\Program Files\salesforce.com\Sforce Data Loader

Copy the sforcedataloader.jar file located in this directory the working directory of your batch process on the target machine. In this working directory, create a folder named "conf" and copy everything in C:\Program Files\salesforce.com\Sforce Data Loader\conf to this directory.

Alternatively, you can  download SforceDataLoader.zip which  contains the jar file and conf  directory for those that can't easily install the windows gui.

3. Setting operation parameters:
Depending on your operation you must provide a minimum set of parameters to run. These parameters can be specified in either the config.properties file in your conf directory or as command line arguments. Modify the parameters according to your operation and desired behavior. Sample required parameters for a basic extract operation of Account records are as follows:

# Required parameters for processing an Extract operation
username=you@yourcompany.com
password=yourpassword
operation=extract
extractionSOQL=SELECT Id,Name FROM Account
extractionTarget=account-extract.csv

Insert/Update/Delete operations have a separate set of required parameters. A sample of the required parameters for a basic input operation for Account records are as follows:

# Required parameters for processing an Insert operation
username=you@yourcompany.com
password=yourpassword
operation=inserts
batchSize=200
mappingFile=conf/default.sdl
dataAccessObject=account-insert.csv
entity=Account
outputSuccess=success.csv
outputError=error.csv

Best Practice: Run the process once manually using the GUI, specifying the parameters as desired. The settings from the manual execution will be saved to:

C:\Documents and Settings\<windows username>\Application Data\salesforce.com\Sforce Data Loader\config.properties

If the operation is not an extract, be sure to save your map as well('default.sdl' above).

Copy that config.properties file and your map (.sdl) to the conf directory in the working directory of your target machine.

Modify the values in config.properties to contain valid system paths for the appropriate files on your target system (the examples here assume a linux/unix system).

4. Prepare and execute

Refer to the section titled, "Using the Command Line Interface" in the Sforce Data Loader User Guide for detailed information to prepare the command and run the data loader from the command line. 

After the load is completed, the program will export your parameters to the config.properties file. This process will not include any comments. If you want to maintain your properties for reference, save them under a different file name. Note: this resave will not include the password property.

Best Practice: If your process is repetitive you should be passing the password as a command-line argument.

You should also pass in the outputSuccess and outputError parameters when performing inserts/updates/deletes and extractionTarget when performing extract as the loader will overwrite files found to have the same filename.

5. Parameter Reference

####### BEGIN FILE  ############################
# Loader Config Parameters
#
# This can be copied as your sample config.properties file. Lines
# beginning with a '#' indicate comments and will be ignored by the
# program.

###### Required for All operations ####################
#
# Salesforce user account credentials. Your account must be active
# and you will see better performance if you run as a System
# Administrator
username=you@yourcompany.com
password=yourpassword

# The operation type being performed. A valid value is
# 1 of: extract, inserts, updates, deletes
operation=extract

#
####### Required only for extract ###################
#
# The query that defines the scope and type of data to be returned
# from sforce. For more information on the Sforce Object Query Language
# please see the API documentation on sforce.com
extractionSOQL=SELECT Id,Name FROM Account

# The file that will contain the data extracted from Salesforce
extractionTarget=account-extract.csv

#
####### Required only for Inserts/Updates/Deletes ###########
#
# The path to the file that contains the field mapping for your
# operation.  It is recommended that you use the Sforce Data Loader
# GUI to generate the mapping file. Note: you must map the Id field in
# updates/deletes operations and you must NOT do so in inserts
mappingFile=conf/default.sdl

# The path to the file containing the source data
dataAccessObject=account-update.csv

# The name of the entity that is the target of the data. This is
# any accessible object in the API. Note: if this is a custom object
# you must append the '__c' suffix (two under-scores, then the letter c)
entity=Account

# The path to the file that contains the success output
outputSuccess=success.csv

# The path to the file that contains the error output
outputError=error.csv

# The size (number of records) of each request
batchSize=200

#
####### Optional parameters #####################
#
# Please refer to the Sforce Data Loader User Guide for more
# information regarding these parameters (default values are shown)

#endpoint=https://www.salesforce.com
#lastBatchRow=0
#assignmentRule=
#rowToStartAt=0
#extractionRequestSize=2000
#insertNulls=false
#timeoutSecs=60
#proxyPassword=
#proxyHost=
#proxyPort=
#
###### END OF FILE #######################

Comments

Posted by Basil Karam on June 16, 2005 09:27 PM:

When I execute the command prompt to extract data, the command window returns 1000's of debug statements. How can I turn this off

Posted by Andrew Waite on June 16, 2005 10:19 PM:

Thanks for the comment Basil. Make sure to take note of the information in the User Guide per step 4., above.

If you set the salesforce.config.dir variable and the directory you supply has the log-conf.xml file in it you should not get this level of logging.

Happy Loading!

Posted by Cecile Franchini on June 27, 2005 07:35 AM:

Is there a way to encrypt the password?

Posted by Rup on June 28, 2005 11:17 AM:

OK, how does this work on Mac OSX (Tiger 10.4.1, Java 1.4.2_07) ?

Posted by Andrew Waite on June 28, 2005 05:14 PM:

This will test the promise of java :) you should be able to follow the same instructions per above. If you don't have access to a windows machine to install the loader and get the necessary files I will update the blog with a zip file containing what you need.

Posted by Tom Greenbank on July 6, 2005 03:47 AM:

For the record this works fine on Linux.

Posted by Roland B on July 13, 2005 08:17 AM:

I'm trying to figure out how to schedule a load/extract jobs within the Sforce data loader

Posted by Andrew Waite on July 13, 2005 08:36 AM:

Roland, you will need to write a script (.sh or .bat) file and then have a system scheduler call it. There is no scheduling mechanism inside the loader itself.

Posted by Pierre Eymard on July 22, 2005 09:42 AM:

Are you planning on updating the blog page, for instance by completing the explanation of all the parameters in conf.properties ? Also, could you make clearer (in red) that the operations are named insertS, updateS,.. and write a paragraph for the required parameters for deletes and updates ? Thanks.

Posted by Narayana on August 3, 2005 04:50 PM:

Hi,

When I try to use CLI of the sforcedata loader I am getting the error as listed below:-

Exception in thread "main" java.util.zip.ZipException: The system cannot find the file specified
at java.util.zip.ZipFile.open(Native Method)
at java.util.zip.ZipFile.(Unknown Source)
at java.util.jar.JarFile.(Unknown Source)
at java.util.jar.JarFile.(Unknown Source)

Posted by Narayana on August 3, 2005 07:37 PM:

Exception in thread "main" java.lang.OutOfMemoryError
this error is thowring up each time after downloading 1000 Records.Any Ideas?

Posted by Andrew Waite on August 3, 2005 08:36 PM:

Pierre,

Update and Delete Operations carry the same required parameters as Insert. The only difference is in the requirements for the data file and changing the operation paramater accordingly. For Update and Delete operations you need a column with the Salesforce Id for each record being updated or deleted. In the case of deletes, the Salesforce Id is all you need in the file.

As for your other request, you should be able to find the answers in the Sforce Data Loader User Guide. The link to it is provided in the blog.

Regards,
Andrew

Posted by Andrew Waite on August 3, 2005 08:41 PM:

Narayana,

I trust from your second message that you are no longer seeing the error reported in the first.

Regarding out of memory, if you are extracting lots of columns and including Description or any Custom Long Text Area fields with lots of data you might see this error with the default heap size.

There is an example of how to set the heap size in the user guide at the top of page 10.

Hope that helps.

Regards,
Andrew

Posted by Narayana on August 3, 2005 10:16 PM:

Andrew,

Thanks for the info.I got it working.

Is their any help on SOQL. I need to compare the date with the cuurent date functions

Posted by Andrew Waite on August 3, 2005 10:28 PM:

Best bet is to review the API documentation:
http://www.sforce.com/us/docs/sforce60/wwhelp/wwhimpl/js/html/wwhelp.htm

You want Sforce API Calls > Sforce Object Query Language (SOQL) > fieldExpressionSyntax then scroll down and look for "Date Formats"

Regards and happy loading!
Andrew

Posted by 7LS on August 16, 2005 05:04 PM:

When I try running this as described in the instructions above, i get an error as follows:

Exception in thread "main" java.lang.NoClassDefFoundError: ?jar

What does this mean?

Posted by Tony Melendez on August 19, 2005 09:23 AM:

Hello Andrew,
The info posted is was very helpful, if you could post an example of a shell script, .bat, where you would actually call the jar file with parms.....a small or complicated one would be great.....
I am going to use the bat to schedule the loader within windows 2000
thanks
tony

Posted by Bal on August 31, 2005 04:33 AM:

Hi Chaps,

1) I want to upload some data from our systems - g8.
2) Then update the data using the data loader.

However I don't want to use the salesforce id (cos I dont know what it may be).
Whats the best way of using the upload facility of the data uploader without mapping the salesforceid.

Cheers
Bal

Posted by Andrew Waite on August 31, 2005 08:04 AM:

Bal,

Thanks for your message. Unfortunately at this time the only way to update existing records is to use the Update operation which requires the salesforce ID for each record.

If you are using the loader to insert the records initially, the success.csv file produced is appended with the appropriate salesforce ID for each row inserted. This establishes a mapping of your records to the salesforce ID.

Additionally, the source code for the loader is available in sourceforge if you want to enhance it to update records based on an ID you do know. We would welcome the contribution to the community. If anyone is interested in working on this please respond here.

Thanks and happy loading,
Andrew

Posted by Michael Markham on August 31, 2005 12:00 PM:

I'm lovin' the SForce Data Loader -- it's running like a champ for me, even my scheduled tasks that fire every night. What a tremendous contribution to the salesforce user community!

The big enhancement I'd like to see would be a connector for ODBC. Anyone thought about adding this to the project yet? I probably don't have the skills to build it at this point, but I'll keep my eye on that goal.

Actually, I'm more even more interested in developing an equivalent CLI tool that runs under PHP 5.1 that _would_ connect to either CSV, MySQL or Oracle. I'll gladly build that tool!

Posted by Bal on September 1, 2005 09:04 AM:

Cheers for the reply Andy.

We've managed to get around it by downloading the salesforceids(+ associated our company id) into a holding table and then simply including them in any future joins.

Subject change!
I've tried referencing the following to get rid of the amount of logging that I am getting but to no avail.
"If you set the salesforce.config.dir variable and the directory you supply has the log-conf.xml file in it you should not get this level of logging."

My batch contains the following:
(Im executing the jar file from the installed folder..sforcedataloader\bin....etc)

-jar sforcedataloader.jar -Dsalesforce.config.dir=d:\sforceworking mappingfile=INSERTcontacts.sdl
entity=contacts
dataAccessObject=contact.csv
outputSuccess=success.csv
outputError=error.csv

And the folder d:\sforceworking does contain a copy of the log-conf.xml in it.

Help!

Cheers
Bal

Posted by Tony Melendez on September 6, 2005 02:10 PM:

Great thanks for the tips on the page.
One last thing, I got a pretty hand batch job, that I created to handle some dynamic sql that gets passed to the loader and have made some directories to handle which config file to use...Also I did have to update the config file via the delims function before the loader get to it and place my sql within it.
I was verifing the data and noticed that loader is not detecting the first record in my csv file, it went un-noticed because the success messages were not indicating any error, I still went to check the error file and behold one record that didn't make it....the first one.....I have two different csv files, both of them process ok without any errors, until you query for the first record in the csv file on you salesforce db. Please I need help in resolving if anyone knows how fix this in the most simple way. thanks, gracias.

Posted by Andrew Waite on September 6, 2005 02:34 PM:

Tony,

Thanks for the feedback!

The command-line interface of the loader assumes the same requirements for your data as does the GUI. Specifically, the GUI expects your first row of data to be the header, i.e. field labels for your columns.

If you do have a header row, you could still see this behavior if you are setting the attribute "rowToStartAt" to 1 (one). If you are doing this, then change it to 0 (zero) or remove it from your configuration and all your data should be processed.

Andrew

Posted by Tony Melendez on September 6, 2005 03:02 PM:

Thanks Andrew!

Posted by Cliff Armstrong on September 26, 2005 03:06 PM:

I'm attempting to run the data loader via command line Java on Linux and I'm getting the following results:

2005-09-26 16:18:35,048 INFO [main] Controller.LoaderController initConfig (LoaderController.java:297) - The log has been initialized

2005-09-26 16:18:35,065 INFO [main] LoaderEngineRunner main (LoaderEngineRunner.java:72) - Initializing engine

2005-09-26 16:18:35,066 INFO [main] LoaderEngineRunner main (LoaderEngineRunner.java:79) - Loading parameters

2005-09-26 16:18:35,066 INFO [main] LoaderEngineRunner main (LoaderEngineRunner.java:90) - Logging in to salesforce.com

2005-09-26 16:18:38,024 ERROR [main] client.PartnerClient connect (PartnerClient.java:142) - ; nested exception is:
javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: No trusted certificate found
2005-09-26 16:18:38,025 ERROR [main] LoaderEngineRunner main (LoaderEngineRunner.java:120) - ; nested exception is:
javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: No trusted certificate found

I've seen earlier posts that indicate everything works fine on Linux with no mention of using the Java keytool - does anyone have suggestions as to how I can fix this problem?

Thanks.

Post a comment

If you have a TypeKey or TypePad account, please Sign In