As noted the other day (Using SalesforceDX to automate getting Apex class test coverage percentages), SalesforceDX is great for many things, one of them being automating operations that are time consuming or require a lot of manual work each time. One of these things is Bulk API operations, which in and of themselves are not hard, but there is no UI for them besides the DataLoader, and there is no console API for the DataLoader when not on Windows.
The customer I'm currently working for has a monster data load to perform, and one of the things I've done is write a script to split the data into data sets – one set per country, 91 sets in all. Each set consists of 3 files to support the data load: one file for Accounts and two additional files for custom objects that need to be loaded as well. All in all that's a lot of clicking in the DataLoader, and it doesn't really scale for testing.
But I'm in luck, as SalesforceDX receives new functionality all the time, and at some point some Bulk API data features had snuck by me, so I was pleasantly surprised to discover force:data:bulk:upsert and force:data:bulk:delete today. They were just what I needed. SalesforceDX to the rescue yet again…
So today I grabbed my IDE by the horns (#vscode in my case) and wrote some wrappers around the Bulk API capabilities of SalesforceDX. The fact that all SalesforceDX commands take an optional --json argument makes them easy to script and their responses easy to parse. Combining this with select-shell from npm, I now have a nice CLI for doing Bulk data loads. The script looks at the available data sets, asks me which country to load data for and then which export timestamp to process (the data sets may exist in multiple versions). Then it goes and does its thing, UPSERTing all 3 files in turn and reporting status. So nice. The Bulk API is asynchronous, so the script also handles polling for job status and only proceeds once a job has completed successfully.
$ ./upsert mheisterberg@example.com.appdev
SFDX - Org for mheisterberg@example.com.appdev is connected...
Select country code:
  ae
  au
  ca
  cn
  es
  fr
  hk
  hu
  co
▸ it
  jp
  kr
  my
  pt
  sg
  th
  tr
  tw
  us
Select timestamp:
  2018-04-16T07:25:37Z
  2018-04-16T08:31:28Z
▸ 2018-04-16T08:34:14Z
Will process following data
Country  : it
Timestamp: 2018-04-16T08:34:14Z
UPSERT for Account data...
Issued UPSERT bulk request to object (Account) - id 7516E000002DckQQAS, jobId 7506E000002QQ3zQAG - state: Queued
SFDX - asking for bulk status for id 7516E000002DckQQAS, jobId 7506E000002QQ3zQAG
SFDX - received bulk status for id 7516E000002DckQQAS, jobId 7506E000002QQ3zQAG - state: Completed
Issued UPSERT bulk request to object (MarketRelation__c) - id 7516E000002DckkQAC, jobId 7506E000002QQ4JQAW - state: Queued
SFDX - asking for bulk status for id 7516E000002DckkQAC, jobId 7506E000002QQ4JQAW
SFDX - received bulk status for id 7516E000002DckkQAC, jobId 7506E000002QQ4JQAW - state: Completed
Issued UPSERT bulk request to object (Consent__c) - id 7516E000002DckuQAC, jobId 7506E000002QQ4OQAW - state: Queued
SFDX - asking for bulk status for id 7516E000002DckuQAC, jobId 7506E000002QQ4OQAW
SFDX - received bulk status for id 7516E000002DckuQAC, jobId 7506E000002QQ4OQAW - state: Completed
Finished upsert of data
Once I'm done testing a particular data set I can use the --delete-accounts flag to my script to delete the data using the Bulk API as well. Here I actually combined force:data:soql:query and force:data:bulk:delete to first retrieve the IDs of the records I need to delete and then kick off the required Bulk API delete requests. Again, easy peasy. And repeatable…
$ ./upsert mheisterberg@example.com.appdev --delete-accounts
SFDX - Org for mheisterberg@example.com.appdev is connected...
Are you sure?
▸ No
  Yes
Received 32463 records
Issued DELETE bulk request to object (Account) - id 7516E000002DckLQAS, jobId 7506E000002QQ3uQAG - state: Queued
SFDX - asking for bulk status for id 7516E000002DckLQAS, jobId 7506E000002QQ3uQAG
SFDX - received bulk status for id 7516E000002DckLQAS, jobId 7506E000002QQ3uQAG - state: InProgress
SFDX - asking for bulk status for id 7516E000002DckLQAS, jobId 7506E000002QQ3uQAG
SFDX - received bulk status for id 7516E000002DckLQAS, jobId 7506E000002QQ3uQAG - state: Completed
Performed delete...
The only real issue I had here was that node.js applies a default maximum buffer size of 200 KB to the output captured by child_process.exec, so I could not simply read the response from the SOQL query off stdout as it may be pretty big. Instead I pipe the output to a tmp-file, read that back in and parse it as JSON. Not ideal, but it gets the job done.
The code of the script itself is for the customer's eyes only, but the source for the helpers is available as sfdx-bulk-helper on GitHub and sfdx-bulk-helper on npm.
YMMV!