Tutorial¶
Installing requirements¶
For this tutoral, apart from the Save-to-DB library itself, we need SLQAlchemy library for working with database. Install both using this command:
pip install Save-to-DB SQLAlchemy
First steps¶
In this tutorial we are going to save and update stock data into a database. Our database schema shown on Fig. 1.
Lets start writing code. Create a new folder called “stdb_tutorial”, then create a file that defines our database connection called “conf.py”. We are going to use SQLite database in this tutorial. The code of the file shown in Listing 1.
from save_to_db import Persister
from save_to_db.adapters import SqlalchemyAdapter
from sqlalchemy import create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
engine = create_engine("sqlite:///database.db", echo=False)
DBSession = sessionmaker(bind=engine)
session = DBSession()
ModelBase = declarative_base()
persister = Persister(SqlalchemyAdapter({"session": session, "ModelBase": ModelBase}))
Add a file that defines our models according to the schema, see Listing 2.
import sqlalchemy as sa
from sqlalchemy.orm import relationship
from conf import ModelBase
class Exchange(ModelBase):
__tablename__ = "exchange"
name = sa.Column(sa.String(), primary_key=True)
class Company(ModelBase):
__tablename__ = "company"
name = sa.Column(sa.String(), primary_key=True)
class Instrument(ModelBase):
__tablename__ = "instrument"
isin_code = sa.Column(sa.String(length=12), primary_key=True)
company_name = sa.Column(
sa.String(), sa.ForeignKey(Company.name, ondelete="CASCADE"), primary_key=True
)
company = relationship(Company, backref="instruments")
class Dividend(ModelBase):
__tablename__ = "dividend"
instrument_isin_code = sa.Column(
sa.String(length=12),
sa.ForeignKey(Instrument.isin_code, ondelete="CASCADE"),
primary_key=True,
)
instrument = relationship(Instrument, backref="dividends")
date = sa.Column(sa.Date(), primary_key=True)
amount = sa.Column(sa.Float(), nullable=False)
class Ticker(ModelBase):
__tablename__ = "ticker"
code = sa.Column(sa.String(), primary_key=True)
exchange_name = sa.Column(
sa.String(), sa.ForeignKey(Exchange.name, ondelete="CASCADE"), primary_key=True
)
exchange = relationship(Exchange, backref="tickers")
instrument_isin_code = sa.Column(
sa.String(length=12),
sa.ForeignKey(Instrument.isin_code, ondelete="CASCADE"),
nullable=False,
)
instrument = relationship(Instrument, backref="tickers")
And finally, before we can move to actually saved data, a short file with Save-to-DB items in it, Listing 3.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 | import models
from save_to_db import item_cls_manager, Item
item_cls_manager.autogenerate = True
class ExchangeItem(Item):
allow_merge_items = True
model_cls = models.Exchange
class CompanyItem(Item):
allow_merge_items = True
model_cls = models.Company
class DividendItem(Item):
model_cls = models.Dividend
class InstrumentItem(Item):
allow_merge_items = True
model_cls = models.Instrument
getters = ["tickers"]
getters_autoconfig = True
class TickerItem(Item):
allow_merge_items = True
model_cls = models.Ticker
|
Note we have not defined items for all the models. We set item_cls_manager.autogenerate on line 4 to True, the missing items will be auto-generated as needed with default parameters.
Now open a terminal, navigate to “stdb_tutorial” folder and type “python” command to open python interactive shell. Lets explore one of our items:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 | >>> import items
>>> items.InstrumentItem()
>>> items.InstrumentItem.print_cls_config()
{'aliases': {},
'allow_merge_items': False,
'allow_multi_update': False,
'autoinject_creators': True,
'batch_size': None,
'conversions': {'boolean_false_strings': ('false', 'no', 'off', '0', '-'),
'boolean_true_strings': ('true', 'yes', 'on', '1', '+'),
'date_formats': '%Y-%m-%d',
'datetime_formats': '%Y-%m-%d %H:%M:%S',
'decimal_separator': '.',
'time_formats': '%H:%M:%S'},
'creators': [{'company', 'isin_code'}],
'creators_autoconfig': None,
'defaults': {},
'fields': {'isin_code': <ColumnType.STRING: 3>},
'getters': [{'tickers'}, {'company', 'isin_code'}],
'getters_autoconfig': True,
'metadata': {'autogenerated_item_cls': False,
'collection_id': None,
'setup_completed': True},
'model_cls': <class 'models.Instrument'>,
'norewrite_fields': {},
'nullables': set(),
'relations': {'company': {'item_cls': <class 'items.CompanyItem'>,
'relation_type': <RelationType.MANY_TO_ONE: 3>,
'replace_x_to_many': False,
'reverse_key': 'instruments'},
'dividends': {'item_cls': <class 'items.DividendItem'>,
'relation_type': <RelationType.ONE_TO_MANY: 2>,
'replace_x_to_many': False,
'reverse_key': 'instrument'},
'tickers': {'item_cls': <class 'save_to_db.core.item_metaclass.TickerItem'>,
'relation_type': <RelationType.ONE_TO_MANY: 2>,
'replace_x_to_many': False,
'reverse_key': 'instrument'}},
'update_only_mode': False}
|
On line 2 we instantiate InstrumentItem class to trigger configuration completion. This also creates missing TickerItem class used in relations. You can read about item configuration in the Item chapter.
Lets load some data into our database, create a new file called “load_initial.py”, see Listing 4.
import items
from conf import engine, ModelBase, persister
ModelBase.metadata.create_all(engine)
# The data can come from anywhere, for demontration purpose we hard-coded it
COMPANY_DATA = [
# COMPANY, INSTRUMENT ISIN, TICKER
["Polymetal International", "JE00B6T5S470", "POLY"],
["Qiwi Plc", "US74735M1080", "QIWI"],
]
DIVIDEND_DATA = [
# INSTRUMENT ISIN, DATE, AMOUNT
["JE00B6T5S470", "2018-05-11", "0.3"],
["JE00B6T5S470", "2017-05-05", "0.18"],
["US74735M1080", "2017-08-29", "0.21"],
["US74735M1080", "2017-05-29", "0.22"],
]
# The exchange on which the company stocks are traded
exchange = items.ExchangeItem(name="Moscow Exchange")
companies = items.CompanyItem.Bulk()
for (company_name, instrument_isin_code, ticker_code) in COMPANY_DATA:
company = companies.gen(name=company_name)
# Since each company has many instruments, "instruments" is a bulk item.
# Items are automatically created when you try to access them.
instrument = company["instruments"].gen(isin_code=instrument_isin_code)
instrument["tickers"].gen(exchange=exchange, code=ticker_code)
# For demontration purposes, loading dividends here
for (instrument_isin_code, date, amount) in (
row for row in DIVIDEND_DATA if row[0] == instrument_isin_code
):
instrument["dividends"].gen(date=date, amount=amount)
persister.persist(companies)
persister.commit()
After running the code, you will have the data in the database.
Now let’s take a close look at how the library worked. First it converted all item fields string values to appropriated data types according to conversions configuration and completed relationships, they are shown on Fig. 2.
Actual saving and updating is done in groups of the same model classed. Grouped this way, our items look like this:
When persisting out bulk item, Save-to-DB library first put all our items and all referenced item into separate lists according to model class. The it tried to get data, fist all companies with one query, then instruments, etc. Since our database is empty, no data was retrieved. After getting data, it starts to create new data in the database if possible. In our case the process when like this:
- First the library tried to create new company data in DB, since we have enough data (according to “creators” configuration value), library issued insert statements for both companies.
- Since we already have company models, both instruments models can be persisted to the database.
- Same logic applies to dividends.
- In case of tickers, at this stage we do not have exchange model yet, so they are skipped for now.
- Exchange model created.
- Ticker models created.
Simply saving new data into a database is not difficult, you could use pure SQLAlchemy with approximately the same amount of code. Now let’s run another code, see Listing 5.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 | import items
from conf import persister
# Note that there is no instrument data here
DIVIDEND_DATA = [
# TICKER DATE, AMOUNT
["QIWI", "2007-03-28", "0.19"], # new data
["QIWI", "2017-05-29", "10.0"], # data to be updated
]
exchange = items.ExchangeItem(name="Moscow Exchange")
# setting fields on a bulk items sets default values for the items
# (you can also use double underscore to access subfields)
dividends = items.DividendItem.Bulk(instrument__tickers__exchange=exchange)
for (ticker_code, date, amount) in DIVIDEND_DATA:
dividend = dividends.gen(date=date, amount=amount)
# using double underscore to access subfields
dividend["instrument__tickers"].gen(code=ticker_code)
persister.persist(dividends)
persister.commit()
|
Note that there is no instrument data here, remember that we added “tickers” as getters to the instrument item class (tickers have a reference to instruments and can be got using “exchange” and “ticker_code” values). Save-to-DB library will not add it automatically as it is a one-to-many relationship, in our case we assume that it is impossible for a stock that belongs to one company today, to belong to another company tomorrow.
Note
On line 18 where we use instrument__tickers key, we have different dividend item each time, meaning that different instrument and ticker item will be generated. Items will be merged into one, since allow_merge_items is set to True for the classes.
If you run the code and look into the database, you’ll see that the updated and newly created data is there.