auto-batching functionality #71

Draft: wants to merge 8 commits into master
86 changes: 86 additions & 0 deletions src/autoBatchNode.js
@@ -0,0 +1,86 @@

const Endpoint = require('./endpoint')
const _ = require('lodash')
Owner:

(You don't use it here so it can be removed, but just FYI in 2021 using Lodash is pretty much always an antipattern)


// This helper function returns all method names, including those inherited through the prototype chain
function getClassMethodNames(klass) {
const isFunction = (x, name) => typeof x[name] === 'function';
const deepFunctions = x =>
x !== Object.prototype &&
Object.getOwnPropertyNames(x)
.filter(name => isFunction(x, name))
.concat(deepFunctions(Object.getPrototypeOf(x)) || []);
const distinctDeepFunctions = klass => Array.from(new Set(deepFunctions(klass)));
return distinctDeepFunctions(klass).filter(name => name !== 'constructor' && !name.startsWith('__'))
}

module.exports = class AutoBatchNode {
constructor(parent, client) {
this.parent = parent
this.client = client
this.children = {}

getClassMethodNames(parent).forEach((key) => {
this[key] = (...args) => {
// Enforce settings from the Client
this._syncSettingsFromClient()

// Return endpoint node from the child pool, if available
const child = this._getChildNode(key, args)
if (child) {
return child
}

// Call the parent's method
const result = parent[key](...args)

// If the method returns parent, return this
if (result === this.parent) {
return this
}
// If method returns a new Endpoint, add it to the child pool and return it
else if (result instanceof Endpoint) {
return this._newChildNode(key, args, result)
}
// Otherwise, return the result of the method
else {
return result
}
}
})
}

_syncSettingsFromClient() {
// If this is the client, don't need to change settings
if (this.parent === this.client) {
return
}
// Copy the settings from client to this endpoint
this.parent
.schema(this.client.schemaVersion)
.language(this.client.lang)
.authenticate(this.client.apiKey)
.debugging(this.client.debug)
// Autobatching endpoints should always use cache
this.parent._skipCache = false

return this
}

_newChildNode(endpointName, args, endpoint) {
endpoint.enableAutoBatch()

const childNode = new AutoBatchNode(endpoint, this.client)
const nameList = [endpointName, ...args]
const childName = nameList.join('_')
this.children[childName] = childNode
return childNode
}

_getChildNode(endpointName, args) {
const nameList = [endpointName, ...args]
const childName = nameList.join('_')
return this.children[childName]
}

}
Owner (@queicherius, Jul 4, 2021):

I have to say the entire wrapping logic here feels a little over-engineered. What benefits does it have over just passing an autoBatching = true flag, just like we pass flags for e.g. debug?

It feels like that would ultimately achieve the same functionality and would e.g. also make it possible to pass the autoBatchDelay to the client if we did something like this:

// Client

autoBatch (flag = true, autoBatchDelay = 50) {
  this.autoBatching = flag
  this.autoBatchDelay = autoBatchDelay
  return this
}
// Endpoint

constructor (parent) {
  // ...
  this.autoBatching = parent.autoBatching
  this.autoBatchDelay = parent.autoBatchDelay

  this._autoBatch = {
    idsForNextBatch: new Set(),
    nextBatchPromise: null,
  }
}

autoBatch (flag = true, autoBatchDelay = 50) {
  // We could also print the debug warning message here, if this.isBulk = false
  // that the flag will just be ignored. But in some cases it might make sense anyway,
  // like this one where it gets passed to a child:
  // client().account().autoBatch(true).characters().many([...])
  // So IMO it would be fine to just set the flag and silently ignore it (see below)
  // if the endpoint is not bulk.

  this.autoBatching = flag
  this.autoBatchDelay = autoBatchDelay
  return this
}

_many (ids, partialRequest = false, skipAutoBatch = false) {
  if (this.isBulk && this.autoBatching && !skipAutoBatch) {
    return this._autoBatchMany(ids)
  }
}
// Usage
const api = client().autoBatch(true, 50)
api.items().many([12]) // -> autobatches
api.account().bank().get() // -> doesn't
api.items().autoBatch(false).many([12]) // -> doesn't
client().items().autoBatch(true).get(12) // -> does
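The collect-then-flush mechanism underlying either design can be sketched in isolation. This is a minimal sketch, not the library's implementation; makeBatcher and fetchMany are illustrative names:

```javascript
// Minimal sketch of the autobatch pattern: collect ids during a short
// window, fire one bulk request, and resolve each caller with its own ids.
function makeBatcher (fetchMany, delay = 50) {
  let idsForNextBatch = new Set()
  let nextBatchPromise = null

  return function batchedMany (ids) {
    // The first caller in the window schedules the flush
    if (idsForNextBatch.size === 0) {
      nextBatchPromise = new Promise(resolve => {
        setTimeout(() => {
          const batched = Array.from(idsForNextBatch)
          idsForNextBatch.clear()
          resolve(fetchMany(batched))
        }, delay)
      }).then(items => {
        // Index the results by id so each caller can pick out its subset
        const indexed = {}
        items.forEach(item => { indexed[item.id] = item })
        return indexed
      })
    }

    ids.forEach(id => idsForNextBatch.add(id))
    return nextBatchPromise.then(indexed => ids.map(id => indexed[id]).filter(x => x))
  }
}

// Two calls inside the same window share a single request
const requests = []
const many = makeBatcher(async ids => { requests.push(ids); return ids.map(id => ({ id })) }, 10)
Promise.all([many([1, 2]), many([2, 3])]).then(([a, b]) => {
  console.log(requests.length) // 1
  console.log(a.map(x => x.id)) // [ 1, 2 ]
  console.log(b.map(x => x.id)) // [ 2, 3 ]
})
```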

Since the autobatching endpoints are static, their settings such as apiKey and schema should not be changed individually.

I don't think this is an issue we need to handle. If the user mutates the language on an already existing endpoint with autobatching enabled while a request is in-flight, I feel like that's on them.

Auto-batching endpoints also override the _skipCache setting, which is normally changed by .live(), to force cache usage.

live() should win over everything; it should force skipping the cache and immediately fire the request.
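One way to honor that precedence, sketched under the assumption that live() sets _skipCache = true as elsewhere in the endpoint code (shouldAutoBatch is an illustrative helper, not the library's API):

```javascript
// Sketch: decide whether a get()/many() call joins the next batch or
// fires immediately. Since live() sets _skipCache, it bypasses batching.
function shouldAutoBatch (endpoint) {
  return endpoint.isBulk && endpoint._autoBatch !== null && !endpoint._skipCache
}

console.log(shouldAutoBatch({ isBulk: true, _autoBatch: {}, _skipCache: false })) // true
console.log(shouldAutoBatch({ isBulk: true, _autoBatch: {}, _skipCache: true })) // false
console.log(shouldAutoBatch({ isBulk: false, _autoBatch: null, _skipCache: false })) // false
```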

Author (@ShinyMeta, Jul 6, 2021):

I agree it would be cleaner if autobatching were handled via a settings flag; however, I think it could be confusing. For example:

let api = client().autoBatch()
api.items().get(101)
api.items().get(102)

The normal behavior of the library is to return a new endpoint from items(), so each call would have a separate autoBatch set of ids under my proposed edit.
In my mind, if the client is set to autoBatch, then subsequent calls for the same endpoint should also autoBatch. The main thing the wrapper does is cache child endpoints so that subsequent calls return the same endpoint objects. For the most part, other method calls are passed through without interference.

The logic I added to re-inherit settings from the parent was probably over-cautious and unnecessary. I was worried that people might try to change settings on the endpoint like this:

await api.items().live().get(101)
await api.items().get(101)

That is fine with the current library, because it makes a new endpoint on each line, so the second line will use the cache. But if the API is returning saved endpoints from a pool for autobatching, then the setting sticks. This can be avoided with proper usage of the library, and removing it cuts some fluff from my edit.
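The child-pooling the wrapper performs can be sketched on its own; makePool and the factory below are illustrative stand-ins for AutoBatchNode's children map:

```javascript
// Sketch: cache children by method name + args, so repeated calls such as
// api.items() return the same (autobatching) endpoint object.
function makePool (factory) {
  const children = {}
  return (name, ...args) => {
    const key = [name, ...args].join('_')
    if (!children[key]) {
      children[key] = factory(name, ...args)
    }
    return children[key]
  }
}

let created = 0
const getChild = makePool(() => ({ id: ++created }))

console.log(getChild('items') === getChild('items')) // true: pooled and reused
console.log(getChild('characters', '1') === getChild('characters', '2')) // false: distinct args
console.log(created) // 3
```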

Owner:

The normal behavior of the library is to return a new endpoint from items(), so each call would have a separate autoBatch set of ids under my proposed edit. In my mind, if the client is set to autoBatch, then subsequent calls for the same endpoint should also autoBatch.

Ah! I didn't think about batching being shared between endpoints of the same type. That's very valid, and I agree that it should work how you described.

The main thing the wrapper is doing is caching child endpoints so that subsequent calls are returning the same endpoint objects. For the most part, other method calls are passed through without interference.

Instead of wrapping the client and returning the same endpoint objects, what do you think about a singleton shared data object that holds the ID list and timeout based on the endpoint URL? That should solve the issues around the shared settings and still have the simple interface I suggested.

Something like this?

const autoBatchSharedData = {}

module.exports = class AbstractEndpoint {
  constructor(parent) {
    // ...

    this.setupAutoBatchSharedData()
  }

  // Setup the shared data for auto batching
  setupAutoBatchSharedData() {
    const autoBatchId = this.baseUrl + this.url + ':' + this.schemaVersion
    if (!autoBatchSharedData[autoBatchId]) {
      autoBatchSharedData[autoBatchId] = {
        idsForNextBatch: new Set(),
        nextBatchPromise: null,
      }
    }

    this._autoBatch = autoBatchSharedData[autoBatchId]
  }

  // Set the schema version
  schema(schema) {
    this.schemaVersion = schema
    this.setupAutoBatchSharedData()
    this.debugMessage(`set the schema to ${schema}`)
    return this
  }
}

The reason I added logic to re-inherit settings from the parent, was probably over-cautious, and unnecessary. I was worried that people might try to change settings on the endpoint like this:

await api.items().live().get(101)
await api.items().get(101)

Yeah, I would also expect the first one to go through and the second one to hit the cache.

Author:

I've been staring at this and wondering why I didn't think of that design before.

Should language/API key be added to the batchId along with schema, or should schema be removed altogether?
If we care, it should probably match _cacheHash. However, this will only cause an issue if the user changes a setting inside the batch window, which is 50ms by default. That would probably be a result of irresponsible use, and could be covered in the documentation about autoBatching.

Owner:

Should language/API key be added to the batchId along with schema, or should schema be removed altogether?
If we care, it should probably match _cacheHash. However, this will only cause an issue if the user changes a setting inside the batch window, which is 50ms by default...

I think using _cacheHash is a good idea! Then it'd behave correctly in all instances (without race conditions), and the only overhead would be the extra function call in the setters for language & API key. If a user changes the settings in the batch window, the request doesn't get batched but it gets the correct result.
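Keying the shared batch data on the cache hash could look roughly like this. A sketch under the assumption that the hash folds in URL, schema, language, and API key, as the discussion suggests; the field names are illustrative:

```javascript
// Sketch: shared autobatch state keyed by the endpoint's settings, so a
// settings change mid-window lands in a different (fresh) batch.
const autoBatchSharedData = {}

function batchKey (endpoint) {
  // Mirrors the idea of _cacheHash: url + schema + language + api key
  return [endpoint.url, endpoint.schemaVersion, endpoint.lang, endpoint.apiKey].join(':')
}

function sharedBatchFor (endpoint) {
  const key = batchKey(endpoint)
  if (!autoBatchSharedData[key]) {
    autoBatchSharedData[key] = { idsForNextBatch: new Set(), nextBatchPromise: null }
  }
  return autoBatchSharedData[key]
}

const en = { url: '/v2/items', schemaVersion: '2021-07-15', lang: 'en', apiKey: null }
const de = { ...en, lang: 'de' }
console.log(sharedBatchFor(en) === sharedBatchFor({ ...en })) // true: same settings share a batch
console.log(sharedBatchFor(en) === sharedBatchFor(de)) // false: changed language gets its own batch
```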

7 changes: 7 additions & 0 deletions src/client.js
@@ -2,6 +2,7 @@ const fetch = require('lets-fetch')
const nullCache = require('./cache/null')
const endpoints = require('./endpoints')
const flow = require('./flow')
const AutoBatchNode = require('./autoBatchNode')

module.exports = class Client {
constructor () {
@@ -12,6 +13,7 @@ module.exports = class Client {
this.caches = [nullCache()]
this.debug = false
this.client = this
this.autoBatchPool = new AutoBatchNode(this, this)
}

// Set the schema version
@@ -79,6 +81,11 @@
})
}

// Maintains a pool of static endpoints with auto-batching enabled
autoBatch() {
return this.autoBatchPool
}

// All the different API endpoints
account () {
return new endpoints.AccountEndpoint(this)
71 changes: 66 additions & 5 deletions src/endpoint.js
@@ -25,6 +25,9 @@ module.exports = class AbstractEndpoint {
this.isOptionallyAuthenticated = false
this.credentials = false

this._autoBatch = null
this.autoBatchDelay = 50

this._skipCache = false
}

@@ -74,6 +77,23 @@
return this
}

// Turn on auto-batching for this endpoint
enableAutoBatch (autoBatchDelay) {
Owner (@queicherius, Jul 4, 2021):

It would be nice to also call this autoBatch like in the client, so it's consistent. Similar to language, debugging etc.

(Expanded on more below)

if (this._autoBatch === null) {
if (!this.isBulk) {
this.debugMessage(`${this.url} is not bulk expanding, endpoint will not have any autobatch behavior`)
return this
}
this._autoBatch = {
// autoBatchDelay: autoBatchDelay,
Owner (@queicherius, Jul 4, 2021):

Please remove commented out code

idsForNextBatch: new Set(),
nextBatchPromise: null,
}
}
this.autoBatchDelay = autoBatchDelay || this.autoBatchDelay
return this
}

// Get all ids
ids () {
this.debugMessage(`ids(${this.url}) called`)
@@ -156,7 +176,14 @@

// Request the single id if the endpoint a bulk endpoint
if (this.isBulk && !url) {
return this._request(`${this.url}?id=${id}`)
if (this._autoBatch === null) {
return this._request(`${this.url}?id=${id}`)
}
else {
return this._autoBatchMany([id]).then((items) => {
return items[0] ? items[0] : null
})
}
}

// We are dealing with a custom url instead
@@ -168,8 +195,34 @@
return this._request(this.url)
}

_autoBatchMany (ids) {
if (this._autoBatch.idsForNextBatch.size === 0) {
this._autoBatch.nextBatchPromise = new Promise((resolve, reject) => {
setTimeout(() => {
const batchedIds = Array.from(this._autoBatch.idsForNextBatch)
this.debugMessage(`autoBatchMany called (${batchedIds.length} ids)`)
this._autoBatch.idsForNextBatch.clear()
return resolve(this.many(batchedIds, true))
}, this.autoBatchDelay)
}).then(items => {
const indexedItems = {}
items.forEach(item => {
indexedItems[item.id] = item
})
return indexedItems
})
}

// Add the requested ids to the pending ids
ids.forEach(id => this._autoBatch.idsForNextBatch.add(id))
// Return the results based on the requested ids
return this._autoBatch.nextBatchPromise.then(indexedItems => {
return ids.map(id => indexedItems[id]).filter(x => x)
})
}

// Get multiple entries by ids
many (ids) {
many (ids, skipAutoBatch) {
Owner:

Suggested change:
- many (ids, skipAutoBatch) {
+ many (ids, skipAutoBatch = false) {

this.debugMessage(`many(${this.url}) called (${ids.length} ids)`)

if (!this.isBulk) {
@@ -186,7 +239,7 @@

// There is no cache time set, so always use the live data
if (!this.cacheTime) {
return this._many(ids)
return this._many(ids, undefined, skipAutoBatch)
}

// Get as much as possible out of the cache
@@ -201,7 +254,7 @@

this.debugMessage(`many(${this.url}) resolving partially from cache (${cached.length} ids)`)
const missingIds = getMissingIds(ids, cached)
return this._many(missingIds, cached.length > 0).then(content => {
return this._many(missingIds, cached.length > 0, skipAutoBatch).then(content => {
const cacheContent = content.map(value => [this._cacheHash(value.id), value])
this._cacheSetMany(cacheContent)

@@ -230,9 +283,17 @@
}

// Get multiple entries by ids from the live API
_many (ids, partialRequest = false) {
_many (ids, partialRequest = false, skipAutoBatch) {
Owner:

Suggested change:
- _many (ids, partialRequest = false, skipAutoBatch) {
+ _many (ids, partialRequest = false, skipAutoBatch = false) {

this.debugMessage(`many(${this.url}) requesting from api (${ids.length} ids)`)

if (this._autoBatch !== null) {
if (!skipAutoBatch) {
return this._autoBatchMany(ids)
}
}
Comment on lines +289 to +293

Owner:

Suggested change:
- if (this._autoBatch !== null) {
-   if (!skipAutoBatch) {
-     return this._autoBatchMany(ids)
-   }
- }
+ if (this._autoBatch !== null && !skipAutoBatch) {
+   return this._autoBatchMany(ids)
+ }




// Chunk the requests to the max page size
const pages = chunk(ids, this.maxPageSize)
const requests = pages.map(page => `${this.url}?ids=${page.join(',')}`)
71 changes: 71 additions & 0 deletions tests/autoBatchNode.spec.js
@@ -0,0 +1,71 @@
/* eslint-env jest */
const Client = require('../src/client')
const AutoBatchNode = require('../src/autoBatchNode')


describe('autoBatchNode', () => {
let client
let autoBatchNode
beforeEach(() => {
client = new Client()
autoBatchNode = new AutoBatchNode(client, client)
})

it(`calls the method of the wrapped client or endpoint`, () => {
let parentCalled = false
client.schema = () => {parentCalled = true}
autoBatchNode.schema()
expect(parentCalled).toEqual(true)
})


it(`method returns client (with changed setting)`, () => {
const resultNode = autoBatchNode.schema('new setting')
expect(resultNode === autoBatchNode).toEqual(true)
expect(autoBatchNode.parent.schemaVersion).toEqual('new setting')
})

describe(`method returns a new endpoint`, () => {
let itemsNode
beforeEach(() => {
itemsNode = autoBatchNode.items()
})

it(`is a new instance of AutoBatchNode`, () => {
expect(itemsNode).toBeInstanceOf(AutoBatchNode)
expect(itemsNode === autoBatchNode).toEqual(false)
})

it(`all method calls sync settings from client, and force caching`, () => {
client.schema('a')
.language('b')
.authenticate('c')
.debugging('d')

itemsNode.live()
itemsNode.debugMessage('syncs settings with client TEST')
expect(itemsNode.parent.schemaVersion).toEqual('a')
expect(itemsNode.parent.lang).toEqual('b')
expect(itemsNode.parent.apiKey).toEqual('c')
expect(itemsNode.parent.debug).toEqual('d')
expect(itemsNode.parent._skipCache).toEqual(false)
})

it(`saves child endpoints for later calls`, () => {
let charNode1 = autoBatchNode.characters('1')
let charNode2 = autoBatchNode.characters('2')

expect(charNode1 === charNode2).toEqual(false)
expect(autoBatchNode.characters('1') === charNode1).toEqual(true)
expect(autoBatchNode.characters('2') === charNode2).toEqual(true)
expect(autoBatchNode.items() === itemsNode).toEqual(true)
expect(Object.keys(autoBatchNode.children).length).toEqual(3)
})

it(`turns on autoBatching for new endpoint`, () => {
expect(itemsNode.parent._autoBatch).not.toBeNull()
})

})

})
28 changes: 28 additions & 0 deletions tests/client.spec.js
@@ -1,9 +1,21 @@
/* eslint-env jest */
const nullCache = require('../src/cache/null')
const memoryCache = require('../src/cache/memory')
const AutoBatchNode = require('../src/autoBatchNode')
const Module = require('../src/client')
const wait = (ms) => new Promise((resolve) => setTimeout(resolve, ms))

// async function expectError (callback) {
// let err
// try {
// await callback()
// } catch (e) {
// err = e
// }

// expect(err).toBeInstanceOf(Error)
// }

describe('client', () => {
let client
beforeEach(() => {
@@ -123,6 +135,22 @@ describe('client', () => {
client.build = tmp
})

describe('autobatch', () => {

it(`can get the autoBatchNode wrapped Client`, () => {
expect(client.autoBatch()).toBeInstanceOf(AutoBatchNode)
expect(client.autoBatch()).toBeInstanceOf(AutoBatchNode)
expect(client.autoBatch().parent).toBeInstanceOf(Module)
expect(client.autoBatch().parent === client).toEqual(true)
})

it('can get an autobatching endpoint', () => {
let endpoint = client.autoBatch().items().parent
expect(endpoint.url).toEqual('/v2/items')
expect(endpoint._autoBatch).not.toBeNull()
})
})

it('can get the account endpoint', () => {
let endpoint = client.account()
expect(endpoint.url).toEqual('/v2/account')