Tutorial

Installing requirements

For this tutoral, apart from the Save-to-DB library itself, we need SLQAlchemy library for working with database. Install both using this command:

pip install Save-to-DB SQLAlchemy

First steps

In this tutorial we are going to save and update stock data into a database. Our database schema shown on Fig. 1.

Database schema

Fig. 1 Database schema.

We have companies that issued stocks (instruments). Companies pay dividends on theirs stocks, and the stocks are traded on exchanges under assigned ticker codes.

Lets start writing code. Create a new folder called “stdb_tutorial”, then create a file that defines our database connection called “conf.py”. We are going to use SQLite database in this tutorial. The code of the file shown in Listing 1.

Listing 1 “stdb_tutorial/conf.py” file content.
from save_to_db import Persister
from save_to_db.adapters import SqlalchemyAdapter

from sqlalchemy import create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

engine = create_engine("sqlite:///database.db", echo=False)
DBSession = sessionmaker(bind=engine)
session = DBSession()
ModelBase = declarative_base()

persister = Persister(SqlalchemyAdapter({"session": session, "ModelBase": ModelBase}))

Add a file that defines our models according to the schema, see Listing 2.

Listing 2 “stdb_tutorial/models.py” file content.
import sqlalchemy as sa
from sqlalchemy.orm import relationship
from conf import ModelBase


class Exchange(ModelBase):
    __tablename__ = "exchange"
    name = sa.Column(sa.String(), primary_key=True)


class Company(ModelBase):
    __tablename__ = "company"
    name = sa.Column(sa.String(), primary_key=True)


class Instrument(ModelBase):
    __tablename__ = "instrument"
    isin_code = sa.Column(sa.String(length=12), primary_key=True)
    company_name = sa.Column(
        sa.String(), sa.ForeignKey(Company.name, ondelete="CASCADE"), primary_key=True
    )
    company = relationship(Company, backref="instruments")


class Dividend(ModelBase):
    __tablename__ = "dividend"
    instrument_isin_code = sa.Column(
        sa.String(length=12),
        sa.ForeignKey(Instrument.isin_code, ondelete="CASCADE"),
        primary_key=True,
    )
    instrument = relationship(Instrument, backref="dividends")
    date = sa.Column(sa.Date(), primary_key=True)
    amount = sa.Column(sa.Float(), nullable=False)


class Ticker(ModelBase):
    __tablename__ = "ticker"
    code = sa.Column(sa.String(), primary_key=True)
    exchange_name = sa.Column(
        sa.String(), sa.ForeignKey(Exchange.name, ondelete="CASCADE"), primary_key=True
    )
    exchange = relationship(Exchange, backref="tickers")
    instrument_isin_code = sa.Column(
        sa.String(length=12),
        sa.ForeignKey(Instrument.isin_code, ondelete="CASCADE"),
        nullable=False,
    )
    instrument = relationship(Instrument, backref="tickers")

And finally, before we can move to actually saved data, a short file with Save-to-DB items in it, Listing 3.

Listing 3 “stdb_tutorial/items.py” file content.
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
import models
from save_to_db import item_cls_manager, Item

item_cls_manager.autogenerate = True


class ExchangeItem(Item):
    allow_merge_items = True
    model_cls = models.Exchange


class CompanyItem(Item):
    allow_merge_items = True
    model_cls = models.Company


class DividendItem(Item):
    model_cls = models.Dividend


class InstrumentItem(Item):
    allow_merge_items = True
    model_cls = models.Instrument
    getters = ["tickers"]
    getters_autoconfig = True


class TickerItem(Item):
    allow_merge_items = True
    model_cls = models.Ticker

Note we have not defined items for all the models. We set item_cls_manager.autogenerate on line 4 to True, the missing items will be auto-generated as needed with default parameters.

Now open a terminal, navigate to “stdb_tutorial” folder and type “python” command to open python interactive shell. Lets explore one of our items:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
>>> import items
>>> items.InstrumentItem()
>>> items.InstrumentItem.print_cls_config()
{'aliases': {},
 'allow_merge_items': False,
 'allow_multi_update': False,
 'autoinject_creators': True,
 'batch_size': None,
 'conversions': {'boolean_false_strings': ('false', 'no', 'off', '0', '-'),
                 'boolean_true_strings': ('true', 'yes', 'on', '1', '+'),
                 'date_formats': '%Y-%m-%d',
                 'datetime_formats': '%Y-%m-%d %H:%M:%S',
                 'decimal_separator': '.',
                 'time_formats': '%H:%M:%S'},
 'creators': [{'company', 'isin_code'}],
 'creators_autoconfig': None,
 'defaults': {},
 'fields': {'isin_code': <ColumnType.STRING: 3>},
 'getters': [{'tickers'}, {'company', 'isin_code'}],
 'getters_autoconfig': True,
 'metadata': {'autogenerated_item_cls': False,
              'collection_id': None,
              'setup_completed': True},
 'model_cls': <class 'models.Instrument'>,
 'norewrite_fields': {},
 'nullables': set(),
 'relations': {'company': {'item_cls': <class 'items.CompanyItem'>,
                           'relation_type': <RelationType.MANY_TO_ONE: 3>,
                           'replace_x_to_many': False,
                           'reverse_key': 'instruments'},
               'dividends': {'item_cls': <class 'items.DividendItem'>,
                             'relation_type': <RelationType.ONE_TO_MANY: 2>,
                             'replace_x_to_many': False,
                             'reverse_key': 'instrument'},
               'tickers': {'item_cls': <class 'save_to_db.core.item_metaclass.TickerItem'>,
                           'relation_type': <RelationType.ONE_TO_MANY: 2>,
                           'replace_x_to_many': False,
                           'reverse_key': 'instrument'}},
 'update_only_mode': False}

On line 2 we instantiate InstrumentItem class to trigger configuration completion. This also creates missing TickerItem class used in relations. You can read about item configuration in the Item chapter.

Lets load some data into our database, create a new file called “load_initial.py”, see Listing 4.

Listing 4 “stdb_tutorial/load_initial.py” file content.
import items
from conf import engine, ModelBase, persister

ModelBase.metadata.create_all(engine)

# The data can come from anywhere, for demontration purpose we hard-coded it
COMPANY_DATA = [
    # COMPANY,                   INSTRUMENT ISIN,    TICKER
    ["Polymetal International", "JE00B6T5S470", "POLY"],
    ["Qiwi Plc", "US74735M1080", "QIWI"],
]
DIVIDEND_DATA = [
    # INSTRUMENT ISIN,   DATE,            AMOUNT
    ["JE00B6T5S470", "2018-05-11", "0.3"],
    ["JE00B6T5S470", "2017-05-05", "0.18"],
    ["US74735M1080", "2017-08-29", "0.21"],
    ["US74735M1080", "2017-05-29", "0.22"],
]

# The exchange on which the company stocks are traded
exchange = items.ExchangeItem(name="Moscow Exchange")
companies = items.CompanyItem.Bulk()
for (company_name, instrument_isin_code, ticker_code) in COMPANY_DATA:
    company = companies.gen(name=company_name)
    # Since each company has many instruments, "instruments" is a bulk item.
    # Items are automatically created when you try to access them.
    instrument = company["instruments"].gen(isin_code=instrument_isin_code)
    instrument["tickers"].gen(exchange=exchange, code=ticker_code)
    # For demontration purposes, loading dividends here
    for (instrument_isin_code, date, amount) in (
        row for row in DIVIDEND_DATA if row[0] == instrument_isin_code
    ):
        instrument["dividends"].gen(date=date, amount=amount)

persister.persist(companies)
persister.commit()

After running the code, you will have the data in the database.

Now let’s take a close look at how the library worked. First it converted all item fields string values to appropriated data types according to conversions configuration and completed relationships, they are shown on Fig. 2.

Completed item relations

Fig. 2 Completed item relations.

We have two companies C, each has one instrument I, each instrument has data about two dividends D. Instruments also have tickers T under which they are being traded on an exchange E.

Actual saving and updating is done in groups of the same model classed. Grouped this way, our items look like this:

Completed item relations

Fig. 3 Items grouped by model class.

When persisting out bulk item, Save-to-DB library first put all our items and all referenced item into separate lists according to model class. The it tried to get data, fist all companies with one query, then instruments, etc. Since our database is empty, no data was retrieved. After getting data, it starts to create new data in the database if possible. In our case the process when like this:

  1. First the library tried to create new company data in DB, since we have enough data (according to “creators” configuration value), library issued insert statements for both companies.
  2. Since we already have company models, both instruments models can be persisted to the database.
  3. Same logic applies to dividends.
  4. In case of tickers, at this stage we do not have exchange model yet, so they are skipped for now.
  5. Exchange model created.
  6. Ticker models created.

Simply saving new data into a database is not difficult, you could use pure SQLAlchemy with approximately the same amount of code. Now let’s run another code, see Listing 5.

Listing 5 “stdb_tutorial/load_second.py” file content.
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
import items
from conf import persister

# Note that there is no instrument data here
DIVIDEND_DATA = [
    # TICKER     DATE,            AMOUNT
    ["QIWI", "2007-03-28", "0.19"],  # new data
    ["QIWI", "2017-05-29", "10.0"],  # data to be updated
]

exchange = items.ExchangeItem(name="Moscow Exchange")
# setting fields on a bulk items sets default values for the items
# (you can also use double underscore to access subfields)
dividends = items.DividendItem.Bulk(instrument__tickers__exchange=exchange)
for (ticker_code, date, amount) in DIVIDEND_DATA:
    dividend = dividends.gen(date=date, amount=amount)
    # using double underscore to access subfields
    dividend["instrument__tickers"].gen(code=ticker_code)

persister.persist(dividends)
persister.commit()

Note that there is no instrument data here, remember that we added “tickers” as getters to the instrument item class (tickers have a reference to instruments and can be got using “exchange” and “ticker_code” values). Save-to-DB library will not add it automatically as it is a one-to-many relationship, in our case we assume that it is impossible for a stock that belongs to one company today, to belong to another company tomorrow.

Note

On line 18 where we use instrument__tickers key, we have different dividend item each time, meaning that different instrument and ticker item will be generated. Items will be merged into one, since allow_merge_items is set to True for the classes.

If you run the code and look into the database, you’ll see that the updated and newly created data is there.