spanner-loader

This directory contains a Python script that imports data into Cloud Spanner. The script reads a gzipped CSV file from a Google Cloud Storage bucket and a local schema file, then inserts the data into a specified Spanner table in batches.
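The read-and-batch flow described above can be sketched with the standard library alone. This is a minimal illustration, not the script's actual implementation: `iter_batches`, `load_gzipped_csv`, and the `insert_batch` callback are hypothetical names, and the gzipped bytes are built in memory rather than downloaded from a bucket.

```python
import csv
import gzip
import io
from itertools import islice


def iter_batches(rows, batch_size):
    """Yield successive lists of at most batch_size rows."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def load_gzipped_csv(raw_bytes, batch_size, insert_batch, delimiter=","):
    """Decompress gzipped CSV bytes and pass the rows to insert_batch in batches.

    insert_batch is a hypothetical callback standing in for the Spanner write.
    Returns the total number of rows handed to the callback.
    """
    total = 0
    with gzip.open(io.BytesIO(raw_bytes), mode="rt", newline="") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        for batch in iter_batches(reader, batch_size):
            insert_batch(batch)
            total += len(batch)
    return total


# Demo with an in-memory file: three rows, batches of two.
sample = gzip.compress(b"a,b\nc,d\ne,f\n")
received = []
count = load_gzipped_csv(sample, batch_size=2, insert_batch=received.append)
print(count, received)
```

With the google-cloud-spanner client library, the callback could plausibly wrap a `database.batch()` context and call `batch.insert(table=..., columns=..., values=...)` for each batch. Whether the script does exactly this is an assumption.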

1. Create a Cloud Spanner Table

Follow the steps in the Spanner Quickstart to create your Spanner instance, database, and table.

2. Create a schema for your Spanner Table

Use the sample.schema file to define the schema for the table that you are going to load. Use a colon ( : ) to separate each column name from its data type, and a comma ( , ) to separate the columns of your table.

For example, for a table with two STRING columns named one and two, the corresponding schema would be:

one:STRING,two:STRING
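To make the format concrete, here is a minimal sketch of how such a schema line could be parsed into column names and types. The `parse_schema` function is hypothetical and is not taken from spanner-loader.py; the script's own parsing may differ.

```python
def parse_schema(text):
    """Parse 'name:TYPE,name:TYPE' into a list of (name, type) tuples."""
    columns = []
    for field in text.strip().split(","):
        name, sep, col_type = field.partition(":")
        # Reject entries without a colon or with an empty name or type.
        if not sep or not name.strip() or not col_type.strip():
            raise ValueError(f"malformed column definition: {field!r}")
        columns.append((name.strip(), col_type.strip().upper()))
    return columns


print(parse_schema("one:STRING,two:STRING"))
# prints [('one', 'STRING'), ('two', 'STRING')]
```

The column order in the schema is assumed to match the column order in the CSV file.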

3. Create a Service Account (optional)

Note: This step is not required if you have already configured the appropriate account and project with the gcloud SDK, or if you are running the tool from a GCE instance within the target project whose service account has appropriate permissions on the target Spanner instance. In these cases, the tool picks up the configuration from the environment automatically.

Optionally, create a service account for the Spanner client library to use when authenticating against your Spanner instance.

Follow the steps described in Creating a Service Account to create a service account for this purpose. Once you have created it, follow the steps described in Creating a Service Account Key to create a key for that account, and finally follow these steps to grant permissions to the service account.

Make sure to use a role with read and write access to Spanner, such as Cloud Spanner Database User. You can find more information on the Cloud Spanner roles here.

4. Usage

Note: This tool requires Python 3.

Install the requirements for the Python script by running the following command:

pip3 install -r requirements.txt

Run the spanner-loader Python script with the required arguments:

python3 spanner-loader.py \
  --instance_id=[Your Cloud Spanner instance ID] \
  --database_id=[Your Cloud Spanner database ID] \
  --table_id=[Your table name] \
  --batchsize=[The number of rows to insert in a batch] \
  --bucket_name=[The name of the bucket for the source file] \
  --file_name=[The csv input data file] \
  --schema_file=[The format file describing the input data file]

Optional parameters:

--delimiter=[The delimiter used between columns in source file]
--project_id=[Your Google Cloud Project id]
--path_to_credentials=[Path to the json file with the credentials]
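As a rough illustration of how the flags listed above fit together, the following sketch declares them with `argparse`. The flag names come from this README; the `build_parser` function, the integer type for `--batchsize`, and the comma default for `--delimiter` are assumptions, not confirmed behavior of spanner-loader.py.

```python
import argparse


def build_parser():
    """Build a parser for the flags documented in this README."""
    parser = argparse.ArgumentParser(
        description="Load a gzipped CSV file into Cloud Spanner."
    )
    # Required string arguments.
    for flag in ("instance_id", "database_id", "table_id",
                 "bucket_name", "file_name", "schema_file"):
        parser.add_argument(f"--{flag}", required=True)
    # Assumption: the batch size is parsed as an integer.
    parser.add_argument("--batchsize", required=True, type=int)
    # Optional arguments; the comma default is an assumption.
    parser.add_argument("--delimiter", default=",")
    parser.add_argument("--project_id")
    parser.add_argument("--path_to_credentials")
    return parser


# Parse an explicit argument list (all values here are placeholders).
args = build_parser().parse_args([
    "--instance_id=my-instance",
    "--database_id=my-database",
    "--table_id=my_table",
    "--batchsize=500",
    "--bucket_name=my-bucket",
    "--file_name=data.csv.gz",
    "--schema_file=sample.schema",
])
print(args.table_id, args.batchsize, repr(args.delimiter))
```

Leaving `--project_id` and `--path_to_credentials` unset corresponds to the case described in step 3, where the configuration is picked up from the environment.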
