# tlohde

progressing with a project

23:17 14/02/2026
891 words

Am I doing a poor imitation of what garmin and strava already do? Yes. Am I learning? Yes. Am I having fun? Mostly yes. I think. I should hope so.

garth

In my previous post on the matter, I mentioned that my silly little CLI would not have the ability to auto-magically sync. And I was wrong.

Having dug into what garmindb does, I discovered garth. garth makes it super easy to grab and download data from garmin connect. Like ~one line easy: garth.Activity.list(). That's the bulk of it. Followed by some magic with zipfile.ZipFile() and io.BytesIO().[1]
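The zipfile.ZipFile() and io.BytesIO() step might look roughly like the sketch below. It assumes the download arrives as the raw bytes of a zip archive holding a .fit file; the garth call itself is left out, and `extract_members` plus the file names are hypothetical stand-ins.

```python
import io
import zipfile


def extract_members(zip_bytes: bytes, suffix: str = ".fit") -> dict[str, bytes]:
    """Return {member name: raw bytes} for each archive member ending in suffix."""
    # BytesIO wraps the bytes in a file-like object, so nothing touches disk.
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return {
            name: archive.read(name)
            for name in archive.namelist()
            if name.lower().endswith(suffix)
        }


# Stand-in for a downloaded payload: build a tiny zip in memory.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("12345_ACTIVITY.fit", b"not a real fit file")
    archive.writestr("readme.txt", b"ignored")

files = extract_members(buffer.getvalue())
print(sorted(files))
```

Keeping everything in memory means there are no temporary zip files to clean up before the bytes go to whatever parses the .fit data.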

So, I can now download new activities from garmin to my ✨database✨.

magic.

duckdb

I am getting more comfortable with SQL and duckdb.[2]

Bulk-importing all my old .gpx files was reasonably straightforward; ditto the .fit files I downloaded from garmin. Having one .db file to back up feels a lot cleaner than the ~2.5k files that I've been struggling and failing to organise for a few years. With the added bonus of the .db being much smaller, and it can be made even smaller if I dump the tables[3] out to .parquet files.

I don't think I'm dealing with enough data to really benefit from duckdb over pandas. I could have skipped the database step altogether:

I suppose I would then lose the security that maybe comes with a database, and the ability to enforce keys exist in other tables...
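To illustrate what "enforcing that keys exist in other tables" buys, here is a small sketch of a foreign key constraint. It uses sqlite3 from the standard library as a stand-in so it runs without extra installs; it is not duckdb, though duckdb also supports FOREIGN KEY constraints. The table layout and column names are assumptions for illustration, not the real schema.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # sqlite only enforces foreign keys when asked
con.execute("CREATE TABLE tracks (track_id INTEGER PRIMARY KEY, activity TEXT)")
con.execute(
    """
    CREATE TABLE points (
        track_id INTEGER NOT NULL REFERENCES tracks (track_id),
        time TEXT,
        lat REAL,
        lon REAL
    )
    """
)
con.execute("INSERT INTO tracks VALUES (1, 'cycling')")
con.execute("INSERT INTO points VALUES (1, '2025-06-01T09:00:00', 55.95, -3.19)")

# A point that refers to a track that does not exist is refused by the database.
try:
    con.execute("INSERT INTO points VALUES (99, '2025-06-01T09:00:05', 55.95, -3.19)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(orphan_rejected)
```

With plain dataframes, the same guarantee would need an explicit check every time points are appended.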

As for speed, again, I don't think I'm dealing with enough data. My points table with all the individual track points has 4.39 million rows, so maybe...

Once I've finished, I might[4] run a few speed comparisons. Since apparently that's a trendy thing to do.[5]

And most of the things I want this tool to be able to do sit under the "query" umbrella...

typer

I'm happy with my choice of CLI-maker-tool. This, coupled with poetry, has been mercifully straightforward to use.

And I now have a handful of commands that do things. These are invoked like so: trak import-bulk or trak cumulative-distance.

I've been diligently adding docstrings (not the best, but not nothing) to everything, which means I am also able to append --help and I get something that looks like:


$ trak filter-date --help

Usage: trak filter-date [OPTIONS] START STOP

filtering tracks by date. returns track_ids NOTE: if input is yyyy defaults to start of year. to get full year (i.e. all tracks in 2025) input: date-filter 2025 2026

Arguments

start TEXT date string for start of time period: yyyy; mm-yyyy; dd-mm-yyyy. seperators can be any of the following: /,.: required
stop TEXT date string for end of time period: yyyy; mm-yyyy; dd-mm-yyyy. seperators can be any of the following: /,.: required
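The date handling described in that help text could be sketched as follows. This is not the tool's actual code: `parse_date` is a hypothetical helper, and the accepted separators (hyphen, slash, comma, full stop, colon) are a reading of the help text rather than a confirmed list.

```python
import re
from datetime import date


def parse_date(text: str) -> date:
    """Parse yyyy, mm-yyyy or dd-mm-yyyy; a missing month or day defaults to 1."""
    parts = [int(p) for p in re.split(r"[-/,.:]", text.strip())]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"expected yyyy, mm-yyyy or dd-mm-yyyy, got {text!r}")
    # Reverse so the year comes first, then pad the missing pieces with 1.
    year, month, day = (parts[::-1] + [1, 1])[:3]
    return date(year, month, day)


print(parse_date("2025"), parse_date("03/2025"), parse_date("14.02.2026"))
```

Because a bare year resolves to 1 January, asking for everything in 2025 means passing 2025 as the start and 2026 as the stop, which matches the note in the help output.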

importers

For bringing data into the database. This is where the wrangling happens.

summaries

For creating tables of summary statistics & showing distance travelled by activity type for each year, along with some graphs.

Many more of these to write...

filters

For searching the database, by time or location.

filter-location is a bit messy, but it works either by supplying a (lat, lon) and a radius, or a bounding box.[6] I think[7] filtering by location is where duckdb will outperform geopandas. Maybe. But again, I'm not dealing with HUGE data.
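The two filtering modes can be sketched in plain Python. This is an illustration of the idea only, not how filter-location is implemented (that runs against duckdb); the function names and sample coordinates are made up, and the bounding box check ignores boxes that cross the antimeridian.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def within_radius(points, centre, radius_m):
    """Keep the (lat, lon) points no further than radius_m from centre."""
    lat0, lon0 = centre
    return [p for p in points if haversine_m(lat0, lon0, p[0], p[1]) <= radius_m]


def within_bbox(points, south, west, north, east):
    """Keep the (lat, lon) points inside the bounding box (edges included)."""
    return [p for p in points if south <= p[0] <= north and west <= p[1] <= east]


points = [(55.9500, -3.1900), (55.9505, -3.1900), (56.0000, -3.1900)]
print(within_radius(points, (55.95, -3.19), 100))
print(within_bbox(points, 55.94, -3.20, 55.96, -3.18))
```

The bounding box test is only comparisons, so it is a cheap first pass before the trigonometry of the radius check.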

plotting

uniplot is cool. I wish subplots were a thing. And I suppose they could be, by writing to a file, then sort of interleaving them...

misc

Some of my old .gpx files have some pretty funky characteristics. There are a few[8] with negative time steps in them.
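Spotting negative time steps comes down to comparing each timestamp with the one before it. A minimal sketch, assuming the track's timestamps have already been pulled out into a list (the helper name and sample times are hypothetical):

```python
from datetime import datetime


def negative_steps(times: list[datetime]) -> list[int]:
    """Indices i where times[i] is earlier than times[i - 1]."""
    return [i for i in range(1, len(times)) if times[i] < times[i - 1]]


times = [
    datetime(2015, 5, 2, 10, 0, 0),
    datetime(2015, 5, 2, 10, 0, 5),
    datetime(2015, 5, 2, 9, 59, 58),  # the clock jumps backwards here
    datetime(2015, 5, 2, 10, 0, 10),
]
print(negative_steps(times))
```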

There are many instances of a single .gpx file holding two short rides, separated by a week or more, that end up greatly exaggerating the duration.
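One way such files could be split automatically is to start a new segment whenever the gap between consecutive points exceeds some threshold. The sketch below assumes a six-hour threshold, which is an arbitrary choice for illustration, and `split_on_gaps` is a hypothetical helper.

```python
from datetime import datetime, timedelta


def split_on_gaps(times, max_gap=timedelta(hours=6)):
    """Group timestamps into segments, starting a new one after any gap > max_gap."""
    segments = []
    for t in times:
        if segments and t - segments[-1][-1] <= max_gap:
            segments[-1].append(t)
        else:
            segments.append([t])
    return segments


# Two ten-minute rides, a week apart, stored as one track.
first = datetime(2016, 7, 1, 18, 0)
second = first + timedelta(days=7)
times = [first + timedelta(minutes=m) for m in range(0, 11, 5)]
times += [second + timedelta(minutes=m) for m in range(0, 11, 5)]

segments = split_on_gaps(times)
naive = times[-1] - times[0]
actual = sum((s[-1] - s[0] for s in segments), timedelta())
print(len(segments), naive, actual)
```

Summing the duration per segment gives twenty minutes here, rather than the week and a bit implied by the first and last timestamps.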

I'm manually dealing with these issues, as that is likely the quicker thing to do.

Stop detection

It is a tricky beast. I'm using movingpandas and my current definition of stopped is: spending two minutes within a 50 m radius. Typing that there, it feels wrong. But it's close enough. Ish. There are some instances where it doesn't clock a stop that is[9] & some where it registers a stop that isn't.[10]
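To make the definition concrete, here is a plain-Python sketch of "two minutes within a 50 m radius". It is not movingpandas and not the tool's code: it measures the radius from the first point of each candidate stop, which is a simplifying assumption, and `detect_stops` and the synthetic track are hypothetical.

```python
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) pairs in degrees."""
    phi1, phi2 = radians(a[0]), radians(b[0])
    dlam = radians(b[1] - a[1])
    h = sin((phi2 - phi1) / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * 6_371_000 * asin(sqrt(h))


def detect_stops(track, radius_m=50.0, min_duration=timedelta(minutes=2)):
    """track: list of (time, lat, lon) sorted by time. Returns [(start, end), ...]."""
    stops = []
    i = 0
    while i < len(track):
        j = i
        # Grow the window while the next point stays within radius_m of point i.
        while j + 1 < len(track) and haversine_m(track[i][1:], track[j + 1][1:]) <= radius_m:
            j += 1
        if track[j][0] - track[i][0] >= min_duration:
            stops.append((track[i][0], track[j][0]))
            i = j + 1  # carry on after the stop
        else:
            i += 1
    return stops


# Synthetic track: a point every 10 s, moving ~111 m per step except for a pause.
t0 = datetime(2025, 6, 1, 9, 0, 0)
track = []
lat = 55.95
for k in range(30):
    if k < 5 or k >= 20:
        lat += 0.001
    track.append((t0 + timedelta(seconds=10 * k), lat, -3.19))

stops = detect_stops(track)
print(stops)
```

A definition like this cannot tell a picnic from a slow push up a hill: anything that stays inside the circle for long enough counts as stopped.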

Who cares? I'm having fun.

footnotes


  1. thank you stackoverflow (probably) ↩︎

  2. both the duckdb CLI and the python API ↩︎

  3. currently just two ↩︎

  4. probably won't ↩︎

  5. there are many posts dedicated to this, and many of them are a bit sloppy, if you catch my drift ↩︎

  6. I was trying to see if I could use mapscii and then pipe in the bounding box from that... ↩︎

  7. based on little more than a weak hunch ↩︎

  8. 19 ↩︎

  9. like turning off the recording device whilst having a picnic, turning it back on, and moving > 50 m before signal is reacquired ↩︎

  10. pushing a heavy bike up a hill ↩︎