Matthew Powers

03/31/2023, 12:29 PM
I tried out the mdBook tool that’s written in Rust and was really impressed. It’s easy to use and generates SEO-friendly URLs. mdBook is used to create the Polars User Guide, which is beautiful. I think the amazing Polars User Guide is part of the reason that project is so successful.
Here are our current docs: https://delta-io.github.io/delta-rs/python/index.html The URLs aren’t SEO-friendly. Here’s an example: https://delta-io.github.io/delta-rs/python/usage.html#querying-delta-tables. Google ignores anchor tags (the part after the #) for SEO. This would be a better URL: https://delta-io.github.io/delta-rs/python/usage/querying-delta-tables. I’m also not sure how to update our docs currently.
Would folks be open to migrating to mdBook? I am happy to do all the work. Migrating to a Rust solution also seems aligned with the delta-rs philosophy 🤓
👍 2
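To make the proposed URL shape concrete, here is a minimal sketch that rewrites an anchor-style docs URL into the page-per-section form suggested above. The `anchor_to_path` helper and its mapping rule are hypothetical and only illustrate the idea; nothing like this exists in delta-rs, Sphinx, or mdBook.

```python
from urllib.parse import urlsplit, urlunsplit


def anchor_to_path(url: str) -> str:
    """Rewrite .../page.html#section as .../page/section.

    Hypothetical mapping, used only to illustrate the URL layout
    proposed in the thread. URLs without a fragment are returned as-is.
    """
    parts = urlsplit(url)
    if not parts.fragment:
        return url
    path = parts.path
    # Drop the ".html" suffix so the page name becomes a directory segment.
    if path.endswith(".html"):
        path = path[: -len(".html")]
    new_path = f"{path.rstrip('/')}/{parts.fragment}"
    # Rebuild the URL with an empty fragment.
    return urlunsplit((parts.scheme, parts.netloc, new_path, parts.query, ""))


current = "https://delta-io.github.io/delta-rs/python/usage.html#querying-delta-tables"
proposed = anchor_to_path(current)
print(proposed)
```

With a layout like this, each section has its own path, so a crawler can index it as a separate page.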

Will Jones

03/31/2023, 2:29 PM
I like mdbook for user guides, but I think there are some real benefits to using Sphinx for Python projects:
• It automatically builds our API reference from the doc strings of our functions.
• It provides a nice way to link between the user guide prose and individual functions/classes in the API reference.
• And it even does a good job interlinking to other API references. (For example, the return type in the get_add_actions() docs links to the docs for pyarrow.lib.RecordBatch.)
👍 1
Perhaps we can do some work to adjust the layout of the site to be more SEO-friendly while keeping it in Sphinx? I’d also like to make sure we don’t break links if possible.

Matthew Powers

03/31/2023, 2:49 PM
Yea, I think Polars uses Sphinx for the API docs as well: https://pola-rs.github.io/polars/py-polars/html/reference/
For the links, these pages are indistinguishable from Google’s perspective:
• https://delta-io.github.io/delta-rs/python/usage.html#examining-a-table
• https://delta-io.github.io/delta-rs/python/usage.html#managing-delta-tables
• https://delta-io.github.io/delta-rs/python/usage.html#writing-delta-tables
The entirety of the docs is just one https://delta-io.github.io/delta-rs/python/usage.html page from Google’s perspective. Unfortunately, that will give users a bad search experience unless we change the URLs. Search for “delta-rs python writing delta tables” and you’ll see that the right page doesn’t show up in the rankings.
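A quick way to see why those links collapse into one page is to strip the fragment from each URL and count what is left. This is only a sketch of the claim made above: it uses the standard-library `urldefrag` function and does not model how Google's crawler actually works. The `distinct_pages` helper is a name made up for this illustration.

```python
from urllib.parse import urldefrag

urls = [
    "https://delta-io.github.io/delta-rs/python/usage.html#examining-a-table",
    "https://delta-io.github.io/delta-rs/python/usage.html#managing-delta-tables",
    "https://delta-io.github.io/delta-rs/python/usage.html#writing-delta-tables",
]


def distinct_pages(links):
    """Return the set of URLs that remain once the #fragment is removed."""
    return {urldefrag(link).url for link in links}


pages = distinct_pages(urls)
print(len(pages), pages)
```

All three links reduce to the same `usage.html` address, which matches the point that the sections cannot be ranked separately under the current URL scheme.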

Jim Hibbard

03/31/2023, 4:05 PM
I think Sphinx only bundles everything under usage if you're using the autodoc feature(s). You could construct a different doc tree by manually writing the rst directives and then using automodule/autofunction etc. to document the individual pieces of the API in each of those smaller docs. It's less automagical but you could get separate URLs that way if that's the main problem.
But to Will's point, if we do that or use mdBook, it sounds like we'll have to set up a bunch of URL redirects so that people linking to the current docs don't get 404s. So it'll probably be a bit of work either way, unfortunately.

Matthew Powers

03/31/2023, 5:15 PM
There are currently three pages in the docs:
• https://delta-io.github.io/delta-rs/python/installation.html#
• https://delta-io.github.io/delta-rs/python/usage.html
• https://delta-io.github.io/delta-rs/python/api_reference.html
If we migrate to any other technology, then we can keep all these pages intact, so no redirects needed. There is no such thing as redirects with anchor tags. And yes, we can reconfigure Sphinx docs to have SEO/user friendly URLs, like the Dask docs: https://docs.dask.org/en/stable/

Jim Hibbard

03/31/2023, 6:34 PM
You're correct about not being able to redirect an anchor tag, I was thinking in the context of react-router where you could handle the redirect on the frontend, but Sphinx is naturally not using React. Good call out.

rtyler

04/03/2023, 7:20 PM
can we have API docs and move everything else into mdbook?

Jim Hibbard

04/03/2023, 8:16 PM
That should be doable, would we want the mdbook docs in another repo then? It could be a separate site too, they could just link to each other.
👍 1

Matthew Powers

04/03/2023, 10:35 PM
@rtyler - yea, I like having API docs and moving everything else into mdbook. Here’s the Polars Book GitHub repo. It’s so nice and easy to navigate.
👍 1

Jim Hibbard

04/03/2023, 11:21 PM
@Matthew Powers Their docs repo looks pretty simple, was anything difficult to configure during your testing or was it all straightforward?

Matthew Powers

04/03/2023, 11:23 PM
@Jim Hibbard - it was all straightforward. mdBook is just an executable that you add to your path. I love it.

Jim Hibbard

04/03/2023, 11:24 PM
Sounds like a winner then, I'll check it out too. I need to make some docs anyways 👍
👏 1

Will Jones

04/07/2023, 6:50 PM
IMO I don’t think there’s much we gain in features by switching part (or all) of our docs to mdBook. I think we can just as easily improve our docs while keeping with Sphinx. But if switching to mdbook makes you more motivated to contribute to the docs, then it seems like a win to me 👍

Matthew Powers

04/08/2023, 8:41 PM
I’ve worked on Sphinx documentation sites for large open source projects that get lots of traffic, and I know they can be configured just fine. Here are the advantages of mdBook over Sphinx (for the “user guide” type content):
• Properly structured URLs out of the box. If we stick with Sphinx, we need to fix the URLs so they can be properly indexed and ranked.
• Better out-of-the-box themes. We can make Sphinx look nice, but it’s harder.
• Markdown instead of RST.
The ultimate goal is beautiful, amazing documentation. Both tools can achieve the goal. I think it’s going to be a lot easier with mdBook.
👍 1

Will Jones

04/08/2023, 9:24 PM
Sounds good. I’m excited to see what you come up with then 👍

Jim Hibbard

04/09/2023, 1:44 AM
Sounds good. The delta-rs docs are still pretty small too, so it's not the worst time to be experimenting either.
👍 2