This gem lets you import large amounts of data from files into your database as quickly as possible.
Suppose you have a CSV file with 500k+ rows. Or even better: 2 million rows. Go, import it. Yes, I know, you can do it... right? Now, do it in Rails... systematically :) This task can be very frustrating and slow. You need a bulk import operation. That's it. Thanks to BulkImporter you only need to write a few lines and you'll get your data imported and updated.
For now, only PostgreSQL 9.4+ databases are supported, because the import relies on the COPY command.
Add this line to your application's Gemfile:
gem 'bulk_importer'
And then execute:
$ bundle
Or install it yourself as:
$ gem install bulk_importer
Suppose your CSV file /tmp/names.csv (delimited by TABs) has the columns ID and LOVELY_NAME, and you need to import this information into your names table, which has the fields id and name. You also want to keep your preexisting data but update it if necessary.
csv_file = File.new '/tmp/names.csv'
csv_columns = {
  # CSV column name => table column name
  'ID' => 'id',
  'LOVELY_NAME' => 'name'
}
# Suppose your table's key is the 'id' column.
keys = csv_columns.first
BulkImporter.import_from_csv(
  'names',
  csv_file,
  csv_columns,
  keys,
  delimiter: '\t',
  # UPDATE_MODE_UPDATE inserts the new data and updates
  # preexisting rows (it searches by keys).
  update_mode: BulkImportModule::UPDATE_MODE_UPDATE
)
That's all! The imported data will be inserted into your names table in a few seconds.
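To try the call above without a real dataset, you first need a tab-delimited file with the ID and LOVELY_NAME header. The following is a minimal sketch of one way to generate such a file. The helper name write_sample_names_file, the sample rows, and the output path are made up for illustration and are not part of BulkImporter.

```ruby
require 'tmpdir'

# Hypothetical helper (not part of BulkImporter): writes a tab-delimited
# file whose header matches the ID / LOVELY_NAME columns of the example.
def write_sample_names_file(path, rows)
  File.open(path, 'w') do |file|
    # Double-quoted "\t" is a real tab character in Ruby.
    file.puts %w[ID LOVELY_NAME].join("\t")
    rows.each { |id, name| file.puts [id, name].join("\t") }
  end
  path
end

# Assumed sample data and location, for illustration only.
sample_path = File.join(Dir.tmpdir, 'names_sample.csv')
write_sample_names_file(sample_path, [[1, 'Alice'], [2, 'Bob']])
```

The resulting file can then be opened with File.new and passed as csv_file in the example above.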
After checking out the repo, run bin/setup to install dependencies. Then, run rake test to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.
To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.
Bug reports and pull requests are welcome on GitHub at https://github.com/abelosorio/bulk_importer.
The gem is available as open source under the terms of the MIT License.
- Write more tests!
- Decouple database queries. This gem should work with any database engine. Yes, even MySQL.
- Since PostgreSQL 9.5, we could implement the UPDATE_MODE_UPDATE mode using the INSERT command with the ON CONFLICT clause.
- Support different database schemas.
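As a rough illustration of the ON CONFLICT item above, the sketch below builds the kind of statement PostgreSQL 9.5+ accepts for an insert-or-update from a staging table. It is not how the gem currently works: the helper build_upsert_sql and the staging table name names_staging are assumptions made for this example, identifiers are interpolated without quoting, and ON CONFLICT requires a unique index or constraint on the key columns.

```ruby
# Hypothetical helper (not part of BulkImporter): builds an
# INSERT ... ON CONFLICT statement that copies rows from a staging table
# into the target table, updating non-key columns when the key exists.
# Identifiers are NOT quoted or escaped here; this is only a sketch.
def build_upsert_sql(target, staging, columns, keys)
  non_keys = columns - keys
  column_list = columns.join(', ')

  # With no non-key columns there is nothing to update, so skip the row.
  conflict_action =
    if non_keys.empty?
      'DO NOTHING'
    else
      assignments = non_keys.map { |c| "#{c} = EXCLUDED.#{c}" }.join(', ')
      "DO UPDATE SET #{assignments}"
    end

  "INSERT INTO #{target} (#{column_list}) " \
    "SELECT #{column_list} FROM #{staging} " \
    "ON CONFLICT (#{keys.join(', ')}) #{conflict_action}"
end

# Assumed table names, matching the names/id/name example used earlier.
sql = build_upsert_sql('names', 'names_staging', %w[id name], %w[id])
puts sql
```

EXCLUDED refers to the row that was proposed for insertion, so each conflicting row in names takes its name value from the staging table.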