In the off-chance that anyone other than me is reading this, it’s likely that there will be no update next week due to the Easter weekend. It may not be the only weekend without an update either. If I find that I didn’t get much done for a particular week, I probably won’t say anything and leave the well to fill up for the next one (although I do have some topics planned for some of those weekends).
In fact, I was not expecting to say much this week, given that work was going through a bit of a crunch period. But the needs of work finally pushed me to add a few features to Dynamo-Browse that were sorely lacking. So that’s where most of this week’s side-project budget went to.
A lot of work done to query expressions in Dynamo-Browse this week, some of it touching on a few topics I mentioned in a previous update.
I’ve finally got around to finishing the
between keyword, so that it works with the query planner and actually produces a DynamoDB query when used with a range[^sort] key. This means no more falling back on table scans. It’s still in a branch as of this post, but I feel much less embarrassed with merging it now, given that this support has been added.
I’ve also made a decision about how to deal with multiple index candidates. Now, when Dynamo-Browse finds that multiple indices can apply for a specific query expression, it will produce an error, requesting you to specify which index to use. This can be done by adding a
using suffix to an expression, which specifies how the query should be evaluated:
color="blue" using index("color-item-index")
This can be used at any time to override the index Dynamo-Browse should use, even if only one index candidate was found. It can also be used to force the query to run as a table scan if you don’t want to use an index at all:
color="blue" using scan
Ideally you shouldn’t need to use this suffix that often. The whole purpose of query expressions was to eliminate the need for specifying details of how the query should be evaluated. But we don’t live in a perfect world, and it makes sense adding this to deal with cases where it helps to be specific.
This is also in a branch, but I’m hoping this would be merged soon as well.
Unified Expression Types and Values
A relatively large change made this week was how how values and types are represented within query expressions.
A query expression, once parsed, can be executed in multiple contexts. It can be used to generate a conditional expression for a DynamoDB query or scan, or it can be evaluated within the app itself to produce a result or alter the fields of a DynamoDB record in memory. Each of these contexts have a different set of types the expression operates on. When interpreting the expression in order to produce a result, the expression operates on types that implement types.AttributeType. This fits nicely with what the expression has to work with, which is usually the raw DynamoDB records returned by the Go client. The context used to produces conditional expressions, however, operate on a sort of hybrid type hieararchy, that supports both
AttributeType and Go types. This is because to the client used to build the expression accept native Go values, which are sometimes available — particularly if they show up in the expression as a literal — but sometimes not.
But here’s the problem: I want to be able to add functions to the expression language that can be used in both contexts. I’ll get into what sort of functions I’m thinking of in a minute, but the issue is that with two sets of type hierarchies, I’d have to implement the functions twice.
Another problem is that an evaluation context operating on AttributeTypes feels very inefficient. Numbers are represented as string, and new attribute values are created on the heap. This is probably not too bad in the grand scheme of things, but it would be nice to use native Go values here, even if it’s just to avoid going from strings to numbers constantly.
So I spent most of yesterday trying to fix this. I built a new private Go interface called
exprValue and added as implementing subtypes all the types supported by DynamoDB — strings, numbers, booleans, lists, etc. Values of these type implement this new interface, and can be converted to Go values or DynamoDB
AttributeType values depending on the need.
Most of the evaluation logic was changed to use these types, including the builtin functions, and already I’m seeing some dramatic improvements of what’s possible now. I can define a function once and it can be evaluated both in the evaluation and query building context (provided that it’s only operating on constant values in the query building context). It also addressed some long standing issues I’ve had with the expression language, such as adding support for using a list with the
in keyword; something that was not possible before:
pk in $someList
This could potentially be helpful with the “fan-out” one I mentioned a few weeks ago.
This is still early days, but I think it’s been a huge improvement to what was there before. And it’s super satisfying cleaning out all this tech-debt, especially if it means I can add features easily now.
Date Functions In The Expression Language
Now with the new type hierarchy in place, the time has come to start adding functions. What existed to date were the operators and functions that DynamoDB’s conditional expression language supported, and little else. It’s time to go beyond that. And to be honest, this was always the plan, especially given that operators like “begins with” (
^=) have been there since the start.
This first thing I’m pondering now is time and date functions. The immediate issue is one of representation: in that I don’t want to settle on any specific one. I’ve seen dates stored as both string date-stamps, usually in ISO 8601, or as integer seconds from the Unix epoch, and it would be good to operate on both of these, in addition to other possible representations, like milliseconds from the Unix epoch, to some other string encoding scheme.
So what I’m thinking is an abstract date-type, probably something backed by Go’s builtin date.Time type. This will neither be a number or a string, but can be converted to one, maybe by using a keyword like
now() as "S" -- represent date as ISO-8601 timestamp now() as "N" -- represent date as Unix timestamp now() -- error: need to convert it to something
Or maybe some other mechanism.
The idea is that all the builtin functions will operate on this type, but will prevent the user from assuming a particular representation, and will force them to choose one.
I think this is something that will fit nicely with the new type hierarchy system, but for now (pun unintended), I’ll stick with Unix timestamp, just so that I can use something that is easy to implement. But to make it crystal clear that this is temporary, any such functions will have an annoying prefix.
So two new functions were added this week: the
_x_now() function, which returns the current time as seconds from the Unix epoch as a number; and the
_x_add(), which returns the sum of two numbers. Much like the time functions, I’d like to eventually add arithmetic operators like
+ to the expression language, but I needed something now and I didn’t have much time to work on that.
Finally, a few random notes about commands dealing with attribute values.
set-attr command can now accept the switch
-to, which can be used to set the attribute to the result of a query expression. No more copying-and-pasting values, and operating on them outside Dynamo-Browse.
The good thing about this is that the previous attribute values are available in the value expression, so you can use this switch to set the value of attribute based on other attributes in a row. This comes in super handy with bulk changes. I’ve used this to adjust the value of TTLs in a table I’m working in. To set the TTL to be 10 minutes into the future, I just marked the rows, entered the command
set-attr -to ttl, and use the expression
_x_add(_x_now(), 600). Super useful.
Also, I’ve found a bug where the
del-attr command does not work with marked items. It’ll only delete attributes from the item that’s selected (i.e. in pink). I haven’t got around to fixing this, but I hope to very soon.
I think that’s all for this week. Until next time.