feat: added ape cache plugin (#680)

* feat: added cache query

* chore: reformatting following pre-commit

* fix: mypy issues

* fix: used more secure way to query the database, and removed TODO

* fix: mypy and flake8 issues

* chore: clean up of docstrings

* fix: blocks database structure, need to work on updating by column

* feat: moved init_db and purge_db functionality to CacheQueryProvider

* feat: handles missing columns, database not existing, no nullable data to database, multiple databases depending on provider

* chore: pre-commit

* fix: cannot import Column from sqlalchemy.types, fixed this in models to remove warning

* chore: linting

* fix: added sqlalchemy to setup.py

* feat: removed schemas.py, no need for pydantic here, can go to the APIs and build these models. Can add separate schemas later down the road.

* feat: allow to cache all data during query no matter what column you query

* mypy issue

* removed breakpoint

* removed TODO columns are no longer going to be handled in the update of cache

* moving DefaultQueryProvider back into managers/query.py

* type ignore for logger due to mypy

* fix: fixed query so it doesn't default to cache

* chore: type ignore issue

* fix: importing logger from ape.logging

* fix: isort issue

* feat: simplified logic in estimate_block_query

* fix: docs build issue

* fix: docs issue

* fix: importing logger mistake

* refactor: simplify logic in ape_cache query.py

* feat: caching data added

* fix: mypy issues

* fix: isort and mypy

* fix: mypy

* fix: documentation for `update_cache`

* fix: mypy

* fix: mypy

* fix: type in update cache

* fix: type in update cache

* fix: remove ignore

* feat: added transactions cache

* feat: created singledispatch for cache

* refactor: remove ContractEvents for now

* fix: cache was failing

* feat: added contract_events table to caching system

* fix: mypy

* refactor: remove print

* feat: major upgrade to CacheQueryProvider

* fix: mypy issue and linters

* fix: remove print

* feat: added raises

* fix: added missing argument to cache_query

* feat: added test and set logger to error

* feat: removed type ignore for sqlalchemy imports and added cache.md

* fix: fixed markdowns and fixed issue where db was being created without init

* fix: pytest failure because of raises in database_file method

* fix: misspelling in index.md

* fix: query manager was not using iterators

* refactor: major update to how cache provider works

* refactor: update ape cache cli to match

* feat: query is operating as expected

* fix: caching issue of integers being too large

* fix: type hint

* fix: isort

* fix: mypy

* fix: mypy

* fix: mypy

* fix: mypy

* fix: mypy for the last time please

* feat: added suggestions

* fix: formatting markdown and managers.query

* fix: suggested changes from fubu

* feat: adding errors to logging for QueryEngine

* fix: mypy and added type ignore to perform_query

* fix: num_transactions in the blocks table is now integer

* fix: CacheQuery docstring

* fix: show warnings instead of errors when db is not inited

* feat: add url for information on QueryEngineError

* feat: add url for information on QueryEngineError

* feat: add url for information on QueryEngineError

* fix: remove breakpoint

* feat: moved try for update_cache inside of the with statement

* feat: adding txn_hash and signature to Transactions

* feat: removed breakpoint

* fix: docstring

* feat: this would be a breaking change

* fix: cache markdown

* fix: remove bytes from transaction api

* refactor: remove type ignore

* refactor: raise message fix

* refactor: revert transaction api back to original form

* refactor: remove any from typing

* feat: getting all data for transactions

* fix: breakpoint removal

* feat: transaction cache fully operational

* fix: type ignores for properties from transaction api

* fix: solve missing data by setting to None

* fix: unused import

* fix: mypy issue with table columns

* fix: black

* fix: wrong order of arguments for estimate query call

* fix: fetch estimate properly in CacheQueryProvider

* fix: wrong estimates in DefaultQueryProvider

estimates fixed for BlockTransactionQuery and ContractEventsQuery

* fix: unable to unpack the two different returns here

* fix: linters

* fix: blocks queries properly

* refactor: replace info with success messages

* refactor: adding singledispatchmethod for perform_query

* feat: transactions and blocks fully operational

* feat: handle individual column queries for blocks

* fix: mypy

* feat: properly calculate query time in DefaultQueryProvider

* feat: one liner for the estimate of query

* fix: cache init and purge messages in test

* fix: HexBytesString schema type doesn't handle string inputs

* fix: updates to ape_ethereum.Block to ensure that values always exist

also added some notes to describe this behavior

* fix: docstring format in HexByteString

* feat: added docstrings to all cli methods for ape cache

* feat: added beta release note to cache markdown file

* fix: query and purge docstrings

* fix: need to convert hash and parent hash to hex in decode_block

* fix: models must return bytes properly from new data type

* refactor: docstrings added to ape_cache.query and cleaned

* fix: database_connection should not be hidden

* fix: mypy issue

* fix: added minor changes to documentation

* fix: comments and some logic tweaks

* fix: perform_transaction_query uses a map

* feat: remove perform_query_clause

* fix: add required to network_option for init and purge

* refactor: raise Error instead of returning None when no cache clause

* feat: better bypass of database when corrupted or not initialized

* fix(HACK): bypass BlockAPI.transactions requests until #994

* fix: bypass database connection whenever connected to local

Co-authored-by: Just some guy <[email protected]>
johnson2427 and fubuloubu authored Aug 18, 2022
1 parent 39c99ea commit 656b987
Showing 15 changed files with 776 additions and 26 deletions.
1 change: 1 addition & 0 deletions docs/index.md
@@ -10,6 +10,7 @@
userguides/installing_plugins
userguides/projects
userguides/compile
userguides/cache
userguides/networks
userguides/developing_plugins
userguides/config
69 changes: 69 additions & 0 deletions docs/userguides/cache.md
@@ -0,0 +1,69 @@
# Cache

**Note: This plugin is in beta release. The functionality is in constant development and many features are still in the planning stages.**

Use the cache plugin to store provider data in a SQLite database.

```bash
ape cache init --network <ecosystem-name>:<network-name>
```

If you want to set up your network connection for caching, use [this guide](./networks.html).

```bash
ape cache init --network ethereum:mainnet
```

This creates a SQLite database file in the hidden `.ape` data folder.

## Get data from the provider

Use `ape console`:

```bash
ape console --network ethereum:mainnet:infura
```

Run a few queries:

```python
In [1]: chain.blocks.query("*", stop_block=20)
In [2]: chain.blocks[-2].transactions
```

On a deployed contract, you can also query events. Below, `FooHappened` is the event from your contract instance that you want to query:

```python
contract_instance.FooHappened.query("*", start_block=-1)
```

where `contract_instance` is the return value of `owner.deploy(Contract)`.

See [this guide](../userguides/contracts.html) for more information on how to get a contract instance.

Exit the IPython interpreter.

You can query the cache database directly for debugging purposes. For example, to get the `blocks` table data from the SQLite database:

```bash
ape cache query --network ethereum:mainnet:infura "SELECT * FROM blocks"
```

Returns:

```bash
hash num_transactions number parent_hash size timestamp gas_limit gas_used base_fee difficulty total_difficulty
0 b'\xd4\xe5g@\xf8v\xae\xf8\xc0\x10\xb8j@\xd5\xf... 0 0 b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00... 540 0 5000 0 None 17179869184 17179869184
1 b'\x88\xe9mE7\xbe\xa4\xd9\xc0]\x12T\x99\x07\xb... 0 1 b'\xd4\xe5g@\xf8v\xae\xf8\xc0\x10\xb8j@\xd5\xf... 537 1438269988 5000 0 None 17171480576 34351349760
2 b'\xb4\x95\xa1\xd7\xe6f1R\xae\x92p\x8d\xa4\x84... 0 2 b'\x88\xe9mE7\xbe\xa4\xd9\xc0]\x12T\x99\x07\xb... 544 1438270017 5000 0 None 17163096064 51514445824
```

To get `transactions` or `contract_events`:

```bash
ape cache query --network ethereum:mainnet:infura "SELECT * FROM transactions"
```

or

```bash
ape cache query --network ethereum:mainnet:infura "SELECT * FROM contract_events"
```
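The `ape cache query` subcommand is a thin wrapper around the engine's database connection (see `src/ape_cache/_cli.py` below). As a minimal sketch of reaching the same connection programmatically from an `ape console` session — assuming the database has been initialized for the active network, and noting that the column list passed to the DataFrame is an assumption that must match your `SELECT` statement:

```python
import pandas as pd

from ape.utils import ManagerAccessMixin

# The cache plugin registers its engine under the "cache" key.
engine = ManagerAccessMixin.query_manager.engines["cache"]

# database_connection is a context manager over the underlying SQLite connection.
with engine.database_connection as conn:
    rows = conn.execute("SELECT number, num_transactions FROM blocks").fetchall()

print(pd.DataFrame(rows, columns=["number", "num_transactions"]))
```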
4 changes: 4 additions & 0 deletions setup.py
@@ -24,10 +24,12 @@
"types-requests", # NOTE: Needed due to mypy typeshed
"types-pkg-resources", # NOTE: Needed due to mypy typeshed
"pandas-stubs==1.2.0.62", # NOTE: Needed due to mypy typeshed
"types-SQLAlchemy>=1.4.49",
"flake8>=4.0.1,<5.0", # Style linter
"flake8-breakpoint>=1.1.0,<2.0.0", # detect breakpoints left in code
"flake8-print>=4.0.0,<5.0.0", # detect print statements left in code
"isort>=5.10.1,<6.0", # Import sorting linter
"pandas-stubs==1.2.0.62", # NOTE: Needed due to mypy types
],
"doc": [
"myst-parser>=0.17.0,<0.18", # Tools for parsing markdown files in the docs
@@ -103,6 +105,7 @@
"pydantic>=1.9.2,<2",
"pygit2>=1.7.2,<2",
"PyGithub>=1.54,<2",
"SQLAlchemy>=1.4.35",
"pytest>=6.0,<8.0",
"python-dateutil>=2.8.2,<3",
"pyyaml>=6.0,<7",
@@ -129,6 +132,7 @@
"pytest11": ["ape_test=ape.pytest.plugin"],
"ape_cli_subcommands": [
"ape_accounts=ape_accounts._cli:cli",
"ape_cache=ape_cache._cli:cli",
"ape_compile=ape_compile._cli:cli",
"ape_console=ape_console._cli:cli",
"ape_plugins=ape_plugins._cli:cli",
1 change: 1 addition & 0 deletions src/ape/__modules__.py
@@ -1,6 +1,7 @@
__modules__ = [
"ape",
"ape_accounts",
"ape_cache",
"ape_compile",
"ape_console",
"ape_ethereum",
5 changes: 4 additions & 1 deletion src/ape/api/providers.py
@@ -58,9 +58,12 @@ class BlockAPI(BaseInterfaceModel):
An abstract class representing a block and its attributes.
"""

# NOTE: All fields in this class (and its subclasses) should not be `Optional`
# except the edge cases noted below

num_transactions: int = 0
hash: Optional[Any] = None # NOTE: pending block does not have a hash
number: Optional[int] = None
number: Optional[int] = None # NOTE: pending block does not have a number
parent_hash: Any = Field(
EMPTY_BYTES32, alias="parentHash"
) # NOTE: genesis block has no parent hash
2 changes: 1 addition & 1 deletion src/ape/api/query.py
@@ -139,7 +139,7 @@ def perform_query(self, query: QueryType) -> Iterator:
Iterator
"""

def update_cache(self, query: QueryType, result: Iterator):
def update_cache(self, query: QueryType, result: Iterator[BaseInterfaceModel]):
"""
Allows a query plugin the chance to update any cache using the results obtained
from other query plugins. Defaults to doing nothing, override to store cache data.
5 changes: 2 additions & 3 deletions src/ape/managers/chain.py
@@ -146,12 +146,11 @@ def query(
)

blocks = self.query_manager.query(query, engine_to_use=engine_to_use)
data = map(lambda val: val.dict(by_alias=False), blocks)

# NOTE: Allow any columns from ecosystem's BlockAPI class
# TODO: fetch the block fields from EcosystemAPI
columns = validate_and_expand_columns(columns, list(self.head.__fields__)) # type: ignore
return pd.DataFrame(columns=columns, data=data)
blocks = map(lambda val: val.dict(by_alias=False), blocks) # type: ignore
return pd.DataFrame(columns=columns, data=blocks)

def range(
self,
47 changes: 31 additions & 16 deletions src/ape/managers/query.py
@@ -1,9 +1,11 @@
from itertools import tee
from typing import Dict, Iterator, Optional

from ape.api import QueryAPI, QueryType
from ape.api.query import BlockQuery, BlockTransactionQuery, ContractEventQuery
from ape.api import QueryAPI, QueryType, TransactionAPI
from ape.api.query import BaseInterfaceModel, BlockQuery, BlockTransactionQuery, ContractEventQuery
from ape.contracts.base import ContractLog, LogFilter
from ape.exceptions import QueryEngineError
from ape.logging import logger
from ape.plugins import clean_plugin_name
from ape.utils import ManagerAccessMixin, cached_property, singledispatchmethod

@@ -26,13 +28,13 @@ def estimate_block_query(self, query: BlockQuery) -> Optional[int]:

@estimate_query.register
def estimate_block_transaction_query(self, query: BlockTransactionQuery) -> int:

return 100
# NOTE: Very loose estimate of 100ms per transaction for this query.
return self.provider.get_block(query.block_id).num_transactions * 100

@estimate_query.register
def estimate_contract_events_query(self, query: ContractEventQuery) -> int:
# NOTE: Very loose estimate of 100ms per block for this query.
return 100
return (query.stop_block - query.start_block) * 100

@singledispatchmethod
def perform_query(self, query: QueryType) -> Iterator: # type: ignore
Expand All @@ -43,12 +45,14 @@ def perform_block_query(self, query: BlockQuery) -> Iterator:
return map(
self.provider.get_block,
# NOTE: the range stop block is a non-inclusive stop.
# Where as the query method is an inclusive stop.
# Where the query method is an inclusive stop.
range(query.start_block, query.stop_block + 1, query.step),
)

@perform_query.register
def perform_block_transaction_query(self, query: BlockTransactionQuery) -> Iterator:
def perform_block_transaction_query(
self, query: BlockTransactionQuery
) -> Iterator[TransactionAPI]:
return self.provider.get_transactions_by_block(query.block_id)

@perform_query.register
@@ -91,30 +95,35 @@ def engines(self) -> Dict[str, QueryAPI]:

engines: Dict[str, QueryAPI] = {"__default__": DefaultQueryProvider()}

for plugin_name, (engine_class,) in self.plugin_manager.query_engines:
for plugin_name, engine_class in self.plugin_manager.query_engines:
engine_name = clean_plugin_name(plugin_name)
engines[engine_name] = engine_class()
engines[engine_name] = engine_class() # type: ignore

return engines

def query(self, query: QueryType, engine_to_use: Optional[str] = None) -> Iterator[QueryAPI]:
def query(
self,
query: QueryType,
engine_to_use: Optional[str] = None,
) -> Iterator[BaseInterfaceModel]:
"""
Args:
query (``QueryType``): The type of query to execute
engine_to_use (Optional[str]): Short-circuit selection logic using
a specific engine. Defaults to None.
a specific engine. Default is set by performance-based selection logic.
Raises: :class:`~ape.exceptions.QueryEngineError`: When given an
invalid or inaccessible ``engine_to_use`` value.
Returns:
Iterator[QueryAPI]
Iterator[BaseInterfaceModel]
"""

if engine_to_use:
if engine_to_use not in self.engines:
raise QueryEngineError(f"Query engine `{engine_to_use}` not found.")

engine = self.engines[engine_to_use]
selected_engine = self.engines[engine_to_use]

else:
# Get heuristics from all the query engines to perform this query
@@ -126,15 +135,21 @@ def query(self, query: QueryType, engine_to_use: Optional[str] = None) -> Iterat
try:
# Find the "best" engine to perform the query
# NOTE: Sorted by fastest time heuristic
engine, _ = min(valid_estimates, key=lambda qe: qe[1]) # type: ignore
selected_engine, _ = min(valid_estimates, key=lambda qe: qe[1]) # type: ignore

except ValueError as e:
raise QueryEngineError("No query engines are available.") from e

# Go fetch the result from the engine
result = engine.perform_query(query)
result = selected_engine.perform_query(query)

# Update any caches
for engine in self.engines.values():
engine.update_cache(query, result)
if not isinstance(engine, selected_engine.__class__):
result, cache_data = tee(result)
try:
engine.update_cache(query, cache_data)
except QueryEngineError as err:
logger.error(str(err))

return result
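Because `perform_query` returns a lazy iterator, the manager cannot hand the same iterator to both the caller and each engine's `update_cache` — whichever consumed it first would exhaust it. That is why the loop above splits the stream with `itertools.tee` once per caching engine. A self-contained sketch of that fan-out pattern (the block data and cache sink here are hypothetical stand-ins):

```python
from itertools import tee


def fan_out(results, cache_sinks):
    """Yield results to the caller while feeding an independent copy to each sink."""
    for sink in cache_sinks:
        # tee() buffers items so both twins can be consumed independently.
        results, cache_copy = tee(results)
        sink(cache_copy)
    return results


blocks = ({"number": i} for i in range(3))  # stand-in for a query result
seen = []
remaining = fan_out(blocks, [seen.extend])

print(list(remaining))  # [{'number': 0}, {'number': 1}, {'number': 2}]
print(len(seen))        # 3 -- the cache sink saw every block
```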
18 changes: 18 additions & 0 deletions src/ape_cache/__init__.py
@@ -0,0 +1,18 @@
from ape import plugins
from ape.api import PluginConfig

from .query import CacheQueryProvider


class CacheConfig(PluginConfig):
size: int = 1024**3 # 1gb


@plugins.register(plugins.Config)
def config_class():
return CacheConfig


@plugins.register(plugins.QueryPlugin)
def query_engines():
return CacheQueryProvider
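The `query_engines` hook above is all it takes to surface an engine in `QueryManager.engines`. For a sense of the surface a third-party engine must cover, here is a hedged sketch of a minimal engine — the class name is hypothetical, and the three methods mirror the `QueryAPI` interface used in `src/ape/managers/query.py` above:

```python
from typing import Iterator, Optional

from ape.api import QueryAPI, QueryType
from ape.exceptions import QueryEngineError


class NullQueryProvider(QueryAPI):
    """Hypothetical engine that never bids on a query and caches nothing."""

    def estimate_query(self, query: QueryType) -> Optional[int]:
        # Returning None tells the query manager this engine cannot handle `query`.
        return None

    def perform_query(self, query: QueryType) -> Iterator:
        # Should never be selected, since estimate_query always declines.
        raise QueryEngineError("NullQueryProvider cannot perform queries.")

    def update_cache(self, query: QueryType, result: Iterator):
        # Default behavior: do nothing (see QueryAPI.update_cache docstring).
        pass
```

It would be registered exactly like the cache engine above, via `@plugins.register(plugins.QueryPlugin)`.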
82 changes: 82 additions & 0 deletions src/ape_cache/_cli.py
@@ -0,0 +1,82 @@
import click
import pandas as pd

from ape import networks
from ape.cli import NetworkBoundCommand, network_option
from ape.logging import logger
from ape.utils import ManagerAccessMixin


def get_engine():
return ManagerAccessMixin.query_manager.engines["cache"]


@click.group(short_help="Query from caching database")
def cli():
"""
Manage query caching database (beta).
"""


@cli.command(short_help="Initialize a new cache database")
@network_option(required=True)
def init(network):
"""
Initializes an SQLite database and creates a file to store data
from the provider.
Note that ape cannot store local data in this database. You have to
give an ecosystem name and a network name to initialize the database.
"""

provider = networks.get_provider_from_choice(network)
ecosystem_name = provider.network.ecosystem.name
network_name = provider.network.name

get_engine().init_database(ecosystem_name, network_name)
logger.success(f"Caching database initialized for {ecosystem_name}:{network_name}.")


@cli.command(
cls=NetworkBoundCommand,
short_help="Call and print SQL statement to the cache database",
)
@network_option()
@click.argument("query_str")
def query(query_str, network):
"""
Allows for a query of the database from an SQL statement.
Note that without an SQL statement, this method will not return
any data from the caching database.
Also note that an ecosystem name and a network name are required
to make the correct connection to the database.
"""

with get_engine().database_connection as conn:
results = conn.execute(query_str).fetchall()
if results:
click.echo(pd.DataFrame(results))


@cli.command(short_help="Purges entire database")
@network_option(required=True)
def purge(network):
"""
Purges data from the selected database instance.
Note that this is a destructive purge, and will remove the database file from disk.
If you want to store data in the caching system, you will have to
re-initialize the database following a purge.
Note that an ecosystem name and network name are required to
purge the database of choice.
"""

provider = networks.get_provider_from_choice(network)
ecosystem_name = provider.network.ecosystem.name
network_name = provider.network.name

get_engine().purge_database(ecosystem_name, network_name)
logger.success(f"Caching database purged for {ecosystem_name}:{network_name}.")
17 changes: 17 additions & 0 deletions src/ape_cache/base.py
@@ -0,0 +1,17 @@
from typing import Any

from sqlalchemy.ext.declarative import as_declarative, declared_attr


@as_declarative()
class Base:
"""
Base class to generate ``__tablename__`` automatically
"""

id: Any
__name__: str

@declared_attr
def __tablename__(cls) -> str:
return cls.__name__.lower()
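Concrete table models subclass this `Base` and inherit the generated `__tablename__`. As a hedged illustration — the column names below are taken from the `blocks` table output in the cache user guide above, while the SQLAlchemy column types and the primary-key choice are assumptions, not the commit's actual model — a blocks model might look like:

```python
from sqlalchemy import Column, Integer, LargeBinary

from .base import Base


class Blocks(Base):
    # __tablename__ resolves to "blocks" via the declared_attr on Base.
    hash = Column(LargeBinary, primary_key=True)  # block hashes are stored as bytes
    num_transactions = Column(Integer)
    number = Column(Integer)
    parent_hash = Column(LargeBinary)
    size = Column(Integer)
    timestamp = Column(Integer)
    gas_limit = Column(Integer)
    gas_used = Column(Integer)
    base_fee = Column(Integer)
    difficulty = Column(Integer)
    total_difficulty = Column(Integer)
```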