Skip to content

rust/language.rs: add full BCP 47 compliant pipeline #16051

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

kasper93
Copy link
Member

  • Removes custom, incomplete language mapping tables
  • Adds validation for the BCP 47-defined format
  • Compares all components separately instead of treating them as the
    same entity
  • Adds canonicalization support for ISO 639-{1,2t,2b,3}

It is not overly strict in validation to preserve compatibility with
"custom" tags. If the language can be parsed and normalized, this will
be used; otherwise, it falls back to direct string comparison.

For BCP 47 components, all mismatches are penalized equally, with 1000
points deducted per mismatch. This value can be adjusted if needed.

This commit should greatly improve support for matching language tags
consisting of more than just the primary language and region.

Disclaimer: Don't worry it is only a draft. No one will take away your precious C mpv.

@Andarwinux
Copy link
Contributor

Is it possible to avoid dependencies on cargo/crates?
This would introduce a build system and dependency ecosystem from an "Another World", and personally, there's nothing objectionable about Rust beyond that.

I'd much rather have these things available as in-tree C plugins that can be built independently, which would avoid most of the various compatibility issues between rust/cargo and the C toolchain.

@kasper93
Copy link
Member Author

Is it possible to avoid dependencies on cargo/crates? This would introduce a build system and dependency ecosystem from an "Another World", and personally, there's nothing objectionable about Rust beyond that.

Using third-party modules easily is one of the strengths here. We don't have to reinvent the wheel for everything, and we don't have to add unnecessary external dependencies, run-time deps.

I'd much rather have these things available as in-tree C plugins that can be built independently, which would avoid most of the various compatibility issues between rust/cargo and the C toolchain.

There is no Cargo involved here. Everything is built by Meson and linked statically as needed. Meson handles everything, there is no "ecosystem from Another World," except for the subproject that downloads the source code from a crate. It works well, except for some issues on Windows. (hopefully linker issues will be resolved on Meson side mesonbuild/meson#14366, this is not even rust related anymore, as we link object files as we would from C files.)

The point here is that there is no difference between using a Rust subproject and, for example, using FFmpeg or libplacebo as a subproject. Which we already do for some of our CI jobs. The build system is still fully Meson, the only difference is that a different compiler is invoked for some files.

@mahkoh
Copy link
Contributor

mahkoh commented Mar 15, 2025

I'm not an mpv developer but I sometimes work with its code.

Adopting rust via meson would be a downgrade in developer experience for me.

I use clion for C development and rustrover for rust development. Currently, all of the usual IDE features work about as well as you can expect from C code: I can jump to definition, find usages, show the call hierarchy, etc. If mpv were to use rust via meson (rather, anything other than cargo), this would stop working for that rust code, because rustrover (and the equivalent rust plugin for clion) do not support rust via meson. The only way to access such IDE features in these IDEs is by having a cargo manifest that the IDEs can use to fetch metadata from.

You are probably already aware, but if you want to see examples of how some other projects integrate cargo with meson you might want to look at various gnome projects and qemu. I assume they just use meson to invoke cargo as an external command to produce a static library.

@kasper93
Copy link
Member Author

kasper93 commented Mar 15, 2025

If mpv were to use rust via meson (rather, anything other than cargo), this would stop working for that rust code, because rustrover (and the equivalent rust plugin for clion) do not support rust via meson. The only way to access such IDE features in these IDEs is by having a cargo manifest that the IDEs can use to fetch metadata from.

I see where you're coming from, but it may not be as big of an issue.

Meson generates rust-project.json (similar to compile_commands.json for C) during build that can be used by the IDEs to configure their language support. See https://mesonbuild.com/Rust.html#use-with-rustanalyzer rust-analyzer works well with this. Bazel also can generate rust-project.json file for tooling to use. I don't use RustRover, but it looks like they indeed do not support external build systems, as evident from those issues https://youtrack.jetbrains.com/issue/RUST-3730, but I don't know the details and how much limitations this imposes, some features are likely to be working still, depends how their language server is implemented.

You are probably already aware, but if you want to see examples of how some other projects integrate cargo with meson you might want to look at various gnome projects and qemu. I assume they just use meson to invoke cargo as an external command to produce a static library.

Both Mesa and QEMU integrates Rust in the same way as this PR does. I find it clean and lightweight. And most importantly transparent to existing meson workflows. Except of course added rustc dependency.

If you have better examples of integrating Rust into C project, feel free to share for me to look at.

@mahkoh
Copy link
Contributor

mahkoh commented Mar 15, 2025

Meson generates rust-project.json (similar to compile_commands.json for C) during build that can be used by the IDEs to configure their language support.

This is not supported by clion or rustrover as you say.

some features are likely to be working still, depends how their language server is implemented.

They do not.

Both Mesa and QEMU integrates Rust in the same way as this PR does.

Yes, mesa has all of the problems I mentioned above.

qemu uses Cargo.toml files which provide IDE integration.

@kasper93
Copy link
Member Author

qemu uses Cargo.toml files which provide IDE integration.

I see. I was looking at top level directory only, but it was not there. I will fix that. Thanks for bringing this up.

@kasper93 kasper93 changed the title misc/language.rs: add full BCP 47 compliant pipeline rust/language.rs: add full BCP 47 compliant pipeline Mar 15, 2025
@kasper93
Copy link
Member Author

@mahkoh: Updated the structure now, should work fine with RustRover or anything else really.

@mahkoh
Copy link
Contributor

mahkoh commented Mar 15, 2025

Works for me, thanks.

- Removes custom, incomplete language mapping tables
- Adds validation for the BCP 47-defined format
- Compares all components separately instead of treating them as the
  same entity
- Adds canonicalization support for ISO 639-{1,2t,2b,3}

It is not overly strict in validation to preserve compatibility with
"custom" tags. If the language can be parsed and normalized, this will
be used; otherwise, it falls back to direct string comparison.

For BCP 47 components, all mismatches are penalized equally, with 1000
points deducted per mismatch. This value can be adjusted if needed.

This commit should greatly improve support for matching language tags
consisting of more than just the primary language and region.
Copy link

@kasper93 kasper93 marked this pull request as ready for review April 16, 2025 21:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants