Load - writing tables to files and databases

The following functions write data from a table to a file-like source or database. For functions that accept a source argument, if the source argument is None or a string, it is interpreted as follows (see also the sketch after this list):

  • None - write to stdout
  • string ending with ‘.gz’ or ‘.bgz’ - write to file via gzip compression
  • string ending with ‘.bz2’ - write to file via bz2 compression
  • any other string - write directly to file
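
For example, a minimal sketch (assuming table is any petl table container, as in the examples below):

>>> from petl import tocsv
>>> tocsv(table)                  # source=None, writes CSV to stdout
>>> tocsv(table, 'test.csv')      # writes directly to the named file
>>> tocsv(table, 'test.csv.gz')   # writes via gzip compression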

Some helper classes are also available for writing to other types of file-like sources, e.g., writing to a Zip file or string buffer; see the section on I/O helper classes for more information.

Delimited files

petl.tocsv(table, source=None, dialect=csv.excel, **kwargs)

Write the table to a CSV file. E.g.:

>>> from petl import tocsv, look
>>> look(table)
+-------+-------+
| 'foo' | 'bar' |
+=======+=======+
| 'a'   | 1     |
+-------+-------+
| 'b'   | 2     |
+-------+-------+
| 'c'   | 2     |
+-------+-------+

>>> tocsv(table, 'test.csv')
>>> # look what it did
... from petl import fromcsv
>>> look(fromcsv('test.csv'))
+-------+-------+
| 'foo' | 'bar' |
+=======+=======+
| 'a'   | '1'   |
+-------+-------+
| 'b'   | '2'   |
+-------+-------+
| 'c'   | '2'   |
+-------+-------+

The source argument is the path of the delimited file; all other keyword arguments are passed to csv.writer(). So, e.g., to override the delimiter from the default CSV dialect, provide the delimiter keyword argument.
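
For example, to write a pipe-delimited file (a sketch; the delimiter keyword is passed through to csv.writer()):

>>> tocsv(table, 'test.psv', delimiter='|')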

Note that if a file already exists at the given location, it will be overwritten.

Supports transparent writing to .gz and .bz2 files.

petl.appendcsv(table, source=None, dialect=csv.excel, **kwargs)

Append data rows to an existing CSV file. E.g.:

>>> # look at an existing CSV file
... from petl import look, fromcsv
>>> testcsv = fromcsv('test.csv')
>>> look(testcsv)
+-------+-------+
| 'foo' | 'bar' |
+=======+=======+
| 'a'   | '1'   |
+-------+-------+
| 'b'   | '2'   |
+-------+-------+
| 'c'   | '2'   |
+-------+-------+

>>> # append some data
... look(table)
+-------+-------+
| 'foo' | 'bar' |
+=======+=======+
| 'd'   | 7     |
+-------+-------+
| 'e'   | 42    |
+-------+-------+
| 'f'   | 12    |
+-------+-------+

>>> from petl import appendcsv 
>>> appendcsv(table, 'test.csv')
>>> # look what it did
... look(testcsv)
+-------+-------+
| 'foo' | 'bar' |
+=======+=======+
| 'a'   | '1'   |
+-------+-------+
| 'b'   | '2'   |
+-------+-------+
| 'c'   | '2'   |
+-------+-------+
| 'd'   | '7'   |
+-------+-------+
| 'e'   | '42'  |
+-------+-------+
| 'f'   | '12'  |
+-------+-------+

The source argument is the path of the delimited file; all other keyword arguments are passed to csv.writer(). So, e.g., to override the delimiter from the default CSV dialect, provide the delimiter keyword argument.

Note that no attempt is made to check that the fields or row lengths are consistent with the existing data; the data rows from the table are simply appended to the file. See also the cat() function.

Supports transparent writing to .gz and .bz2 files.

petl.totsv(table, source=None, dialect=csv.excel_tab, **kwargs)

Convenience function, as tocsv() but with a different default dialect (tab delimited).
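
E.g., a minimal sketch:

>>> from petl import totsv
>>> totsv(table, 'test.tsv')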

Supports transparent writing to .gz and .bz2 files.

New in version 0.9.

petl.appendtsv(table, source=None, dialect=csv.excel_tab, **kwargs)

Convenience function, as appendcsv() but with a different default dialect (tab delimited).

Supports transparent writing to .gz and .bz2 files.

New in version 0.9.

petl.toucsv(table, source=None, dialect=csv.excel, encoding='utf-8', **kwargs)

Write the table to a CSV file via the given encoding. Like tocsv() but accepts an additional encoding argument, which should be one of the encodings supported by Python. See also the codecs module.
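
E.g., a minimal sketch, assuming the table contains text to be encoded as UTF-16:

>>> from petl import toucsv
>>> toucsv(table, 'test.csv', encoding='utf-16')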

New in version 0.19.

petl.appenducsv(table, source=None, dialect=csv.excel, encoding='utf-8', **kwargs)

Append the table to a CSV file via the given encoding. Like appendcsv() but accepts an additional encoding argument, which should be one of the encodings supported by Python. See also the codecs module.

New in version 0.19.

petl.toutsv(table, source=None, dialect=csv.excel_tab, **kwargs)

Convenience function, as toucsv() but with a different default dialect (tab delimited).

New in version 0.19.

petl.appendutsv(table, source=None, dialect=csv.excel_tab, **kwargs)

Convenience function, as appenducsv() but with a different default dialect (tab delimited).

New in version 0.19.

Pickle files

petl.topickle(table, source=None, protocol=-1)

Write the table to a pickle file. E.g.:

>>> from petl import topickle, look
>>> look(table)
+-------+-------+
| 'foo' | 'bar' |
+=======+=======+
| 'a'   | 1     |
+-------+-------+
| 'b'   | 2     |
+-------+-------+
| 'c'   | 2     |
+-------+-------+

>>> topickle(table, 'test.dat')
>>> # look what it did
... from petl import frompickle
>>> look(frompickle('test.dat'))
+-------+-------+
| 'foo' | 'bar' |
+=======+=======+
| 'a'   | 1     |
+-------+-------+
| 'b'   | 2     |
+-------+-------+
| 'c'   | 2     |
+-------+-------+

Note that if a file already exists at the given location, it will be overwritten.

The pickle file format preserves type information, i.e., writing then reading a table round-trips values without loss.

Supports transparent writing to .gz and .bz2 files.

petl.appendpickle(table, source=None, protocol=-1)

Append data to an existing pickle file. E.g.:

>>> from petl import look, frompickle
>>> # inspect an existing pickle file
... testdat = frompickle('test.dat')
>>> look(testdat)
+-------+-------+
| 'foo' | 'bar' |
+=======+=======+
| 'a'   | 1     |
+-------+-------+
| 'b'   | 2     |
+-------+-------+
| 'c'   | 2     |
+-------+-------+

>>> # append some data
... from petl import appendpickle
>>> look(table)
+-------+-------+
| 'foo' | 'bar' |
+=======+=======+
| 'd'   | 7     |
+-------+-------+
| 'e'   | 42    |
+-------+-------+
| 'f'   | 12    |
+-------+-------+

>>> appendpickle(table, 'test.dat')
>>> # look what it did
... look(testdat)
+-------+-------+
| 'foo' | 'bar' |
+=======+=======+
| 'a'   | 1     |
+-------+-------+
| 'b'   | 2     |
+-------+-------+
| 'c'   | 2     |
+-------+-------+
| 'd'   | 7     |
+-------+-------+
| 'e'   | 42    |
+-------+-------+
| 'f'   | 12    |
+-------+-------+

Note that no attempt is made to check that the fields or row lengths are consistent with the existing data; the data rows from the table are simply appended to the file. See also the cat() function.

Supports transparent writing to .gz and .bz2 files.

Text files

petl.totext(table, source=None, template=None, prologue=None, epilogue=None)

Write the table to a text file. E.g.:

>>> from petl import totext, look    
>>> look(table)
+-------+-------+
| 'foo' | 'bar' |
+=======+=======+
| 'a'   | 1     |
+-------+-------+
| 'b'   | 2     |
+-------+-------+
| 'c'   | 2     |
+-------+-------+

>>> prologue = """{| class="wikitable"
... |-
... ! foo
... ! bar
... """
>>> template = """|-
... | {foo}
... | {bar}
... """
>>> epilogue = "|}"
>>> totext(table, 'test.txt', template, prologue, epilogue)
>>> 
>>> # see what we did
... with open('test.txt') as f:
...     print f.read()
...     
{| class="wikitable"
|-
! foo
! bar
|-
| a
| 1
|-
| b
| 2
|-
| c
| 2
|}

The template is used to format each data row via str.format(), with values available under their field names, e.g., {foo}.

Supports transparent writing to .gz and .bz2 files.

petl.appendtext(table, source=None, template=None, prologue=None, epilogue=None)

Append the table to a text file.
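
E.g., a minimal sketch reusing the template from the totext() example above (the prologue and epilogue would typically be omitted when appending):

>>> from petl import appendtext
>>> appendtext(table, 'test.txt', template)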

New in version 0.19.

petl.toutext(table, source=None, encoding='utf-8', template=None, prologue=None, epilogue=None)

Write the table to a text file via the given encoding. Like totext() but accepts an additional encoding argument, which should be one of the encodings supported by Python. See also the codecs module.

New in version 0.19.

petl.appendutext(table, source=None, encoding='utf-8', template=None, prologue=None, epilogue=None)

Append the table to a text file via the given encoding. Like appendtext() but accepts an additional encoding argument, which should be one of the encodings supported by Python. See also the codecs module.

New in version 0.19.

JSON files

petl.tojson(table, source=None, prefix=None, suffix=None, *args, **kwargs)

Write a table in JSON format, with rows output as JSON objects. E.g.:

>>> from petl import tojson, look
>>> look(table)
+-------+-------+
| 'foo' | 'bar' |
+=======+=======+
| 'a'   | 1     |
+-------+-------+
| 'b'   | 2     |
+-------+-------+
| 'c'   | 2     |
+-------+-------+

>>> tojson(table, 'example.json')
>>> # check what it did
... with open('example.json') as f:
...     print f.read()
... 
[{"foo": "a", "bar": 1}, {"foo": "b", "bar": 2}, {"foo": "c", "bar": 2}]

Note that this is currently not streaming; all data is loaded into memory before being written to the file.
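
The signature also accepts further positional and keyword arguments; assuming these are passed through to the underlying json.dump() call (an assumption based on the *args and **kwargs in the signature), pretty-printed output could be requested like so:

>>> tojson(table, 'example.json', indent=4)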

Supports transparent writing to .gz and .bz2 files.

New in version 0.5.

petl.tojsonarrays(table, source=None, prefix=None, suffix=None, output_header=False, *args, **kwargs)

Write a table in JSON format, with rows output as JSON arrays. E.g.:

>>> from petl import tojsonarrays, look
>>> look(table)
+-------+-------+
| 'foo' | 'bar' |
+=======+=======+
| 'a'   | 1     |
+-------+-------+
| 'b'   | 2     |
+-------+-------+
| 'c'   | 2     |
+-------+-------+

>>> tojsonarrays(table, 'example.json')
>>> # check what it did
... with open('example.json') as f:
...     print f.read()
... 
[["a", 1], ["b", 2], ["c", 2]]

Note that this is currently not streaming; all data is loaded into memory before being written to the file.

Supports transparent writing to .gz and .bz2 files.

New in version 0.11.

HTML files

petl.tohtml(table, source=None, caption=None, representation=str, lineterminator='\r\n')

Write the table as HTML to a file. E.g.:

>>> from petl import tohtml, look    
>>> look(table)
+-------+-------+
| 'foo' | 'bar' |
+=======+=======+
| 'a'   | 1     |
+-------+-------+
| 'b'   | 2     |
+-------+-------+
| 'c'   | 2     |
+-------+-------+

>>> tohtml(table, 'test.html')

New in version 0.12.

Changed in version 0.17.1: added support for the caption keyword argument to provide a table caption in the output.
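
E.g., a sketch using the caption keyword:

>>> tohtml(table, 'test.html', caption='example data')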

petl.touhtml(table, source=None, caption=None, encoding='utf-8', representation=unicode, lineterminator=u'\r\n')

Write the table as Unicode HTML to a file.
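
E.g., a minimal sketch:

>>> from petl import touhtml
>>> touhtml(table, 'test.html', encoding='utf-8')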

New in version 0.19.

Databases

petl.tosqlite3(table, filename_or_connection, tablename, create=False, commit=True)

Load data into a table in an sqlite3 database. Note that if the database table exists, it will be truncated, i.e., all existing rows will be deleted prior to inserting the new data. E.g.:

>>> from petl import tosqlite3, look
>>> look(table)
+-------+-------+
| 'foo' | 'bar' |
+=======+=======+
| 'a'   | 1     |
+-------+-------+
| 'b'   | 2     |
+-------+-------+
| 'c'   | 2     |
+-------+-------+

>>> # with create=True, if the table does not already exist, it will be created
... tosqlite3(table, 'test.db', 'foobar', create=True)
>>> # look what it did
... from petl import fromsqlite3
>>> look(fromsqlite3('test.db', 'select * from foobar'))
+-------+-------+
| 'foo' | 'bar' |
+=======+=======+
| u'a'  | 1     |
+-------+-------+
| u'b'  | 2     |
+-------+-------+
| u'c'  | 2     |
+-------+-------+

If the table does not exist and create=True, a table will be created using the field names in the table header. Note, however, that no type specifications will be included in the table creation statement, so column type affinities may be inappropriate.

Changed in version 0.10.2: either a database file name or a connection object can be given as the second argument.

Changed in version 0.21: the default value for the create argument changed to False.

petl.appendsqlite3(table, filename_or_connection, tablename, commit=True)

Load data into an existing table in an sqlite3 database. Note that the data will be appended to the database table, i.e., the new rows will be inserted and any existing rows will remain. E.g.:

>>> from petl import appendsqlite3, look
>>> look(moredata)
+-------+-------+
| 'foo' | 'bar' |
+=======+=======+
| 'd'   | 7     |
+-------+-------+
| 'e'   | 9     |
+-------+-------+
| 'f'   | 1     |
+-------+-------+

>>> appendsqlite3(moredata, 'test.db', 'foobar') 
>>> # look what it did
... from petl import look, fromsqlite3
>>> look(fromsqlite3('test.db', 'select * from foobar'))
+-------+-------+
| 'foo' | 'bar' |
+=======+=======+
| u'a'  | 1     |
+-------+-------+
| u'b'  | 2     |
+-------+-------+
| u'c'  | 2     |
+-------+-------+
| u'd'  | 7     |
+-------+-------+
| u'e'  | 9     |
+-------+-------+
| u'f'  | 1     |
+-------+-------+

Changed in version 0.10.2: either a database file name or a connection object can be given as the second argument.

petl.todb(table, dbo, tablename, schema=None, commit=True)

Load data into an existing database table via a DB-API 2.0 connection or cursor. Note that the database table will be truncated, i.e., all existing rows will be deleted prior to inserting the new data. E.g.:

>>> from petl import look, todb
>>> look(table)
+-------+-------+
| 'foo' | 'bar' |
+=======+=======+
| 'a'   | 1     |
+-------+-------+
| 'b'   | 2     |
+-------+-------+
| 'c'   | 2     |
+-------+-------+

... using sqlite3:

>>> import sqlite3
>>> connection = sqlite3.connect('test.db')
>>> # assuming table "foobar" already exists in the database
... todb(table, connection, 'foobar')    

... using psycopg2:

>>> import psycopg2 
>>> connection = psycopg2.connect("dbname=test user=postgres")
>>> # assuming table "foobar" already exists in the database
... todb(table, connection, 'foobar')    

... using MySQLdb:

>>> import MySQLdb
>>> connection = MySQLdb.connect(passwd="moonpie", db="thangs")
>>> # tell MySQL to use standard quote character
... connection.cursor().execute('SET SQL_MODE=ANSI_QUOTES')
>>> # load data, assuming table "foobar" already exists in the database
... todb(table, connection, 'foobar')    

N.B., for MySQL the statement SET SQL_MODE=ANSI_QUOTES is required to ensure MySQL uses SQL-92 standard quote characters.

Changed in version 0.10.2: a cursor can also be provided instead of a connection, e.g.:

>>> import psycopg2 
>>> connection = psycopg2.connect("dbname=test user=postgres")
>>> cursor = connection.cursor()
>>> todb(table, cursor, 'foobar')    
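
The schema keyword can be used where the target table lives in a named schema (a sketch, assuming a PostgreSQL database with a schema named 'myschema'):

>>> todb(table, connection, 'foobar', schema='myschema')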

petl.appenddb(table, dbo, tablename, schema=None, commit=True)

Load data into an existing database table via a DB-API 2.0 connection or cursor. Note that the data will be appended to the database table, i.e., the new rows will be inserted and any existing rows will remain. E.g.:

>>> from petl import look, appenddb
>>> look(table)
+-------+-------+
| 'foo' | 'bar' |
+=======+=======+
| 'a'   | 1     |
+-------+-------+
| 'b'   | 2     |
+-------+-------+
| 'c'   | 2     |
+-------+-------+

... using sqlite3:

>>> import sqlite3
>>> connection = sqlite3.connect('test.db')
>>> # assuming table "foobar" already exists in the database
... appenddb(table, connection, 'foobar')    

... using psycopg2:

>>> import psycopg2 
>>> connection = psycopg2.connect("dbname=test user=postgres")
>>> # assuming table "foobar" already exists in the database
... appenddb(table, connection, 'foobar')    

... using MySQLdb:

>>> import MySQLdb
>>> connection = MySQLdb.connect(passwd="moonpie", db="thangs")
>>> # tell MySQL to use standard quote character
... connection.cursor().execute('SET SQL_MODE=ANSI_QUOTES')
>>> # load data, appending rows to table "foobar" 
... appenddb(table, connection, 'foobar')    

N.B., for MySQL the statement SET SQL_MODE=ANSI_QUOTES is required to ensure MySQL uses SQL-92 standard quote characters.

Changed in version 0.10.2: a cursor can also be provided instead of a connection, e.g.:

>>> import psycopg2 
>>> connection = psycopg2.connect("dbname=test user=postgres")
>>> cursor = connection.cursor()
>>> appenddb(table, cursor, 'foobar')