Assistance Required: Resolving S3AFileSystem ClassNotFoundException in Trino #25640
Unanswered
kran46aditya asked this question in Q&A
Replies: 1 comment
First, if your data is on S3, the parameter fs.native-s3.enabled should be set to true. Second, do you get the error in the Trino log or the Hive log? To my knowledge, it should not be necessary to copy any files when using the official image.
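For reference, a native-S3 hive.properties for MinIO could look roughly like this (a sketch based on the endpoint and credentials in your post; the s3.region value is an assumption, since MinIO generally accepts an arbitrary region):

```properties
# hive.properties with the native S3 file system (sketch, untested)
hive.metastore.uri=thrift://hive-metastore:9083
fs.native-s3.enabled=true
s3.endpoint=http://minio:9000
# Assumed value; MinIO does not enforce a particular region
s3.region=us-east-1
s3.path-style-access=true
s3.aws-access-key=minioadmin
s3.aws-secret-key=minioadmin
```

With the native file system enabled, Trino uses its own S3 client rather than Hadoop's S3AFileSystem, so as far as I know the copied Hadoop JARs should not be needed at all.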
---
We are trying to set up Trino with MinIO and have encountered a persistent issue with our Trino configuration. Below is a detailed overview of our setup, the error we are facing, and the steps we have taken to address it.
Our setup uses the following components:
- Trino (version 474) as the query engine, using the official Docker image trinodb/trino:474
- Spark for data ingestion
- MinIO (latest stable version) as the S3-compatible storage layer, configured at http://minio:9000
- Hive Metastore (version 3.1.3, backed by MySQL, Dockerized) for cataloging Apache Hudi tables

Spark jobs write Hudi tables to MinIO using s3a:// URIs and sync them to the Hive Metastore (sketched below). Trino is configured to access these Hudi tables via the Hive connector, with Hadoop 3.2.4 providing the necessary dependencies.
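For context, the ingestion job is along these lines (a sketch: the table name, key fields, paths, and database are illustrative assumptions; the fs.s3a.* and hoodie.* option names are the standard hadoop-aws and Hudi ones):

```python
# Sketch of a Spark job writing a Hudi table to MinIO and syncing it to the
# Hive Metastore. Table/database names, key fields, and paths are assumptions.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("hudi-ingest")
    # hadoop-aws (S3A) settings pointing at MinIO
    .config("spark.hadoop.fs.s3a.endpoint", "http://minio:9000")
    .config("spark.hadoop.fs.s3a.access.key", "minioadmin")
    .config("spark.hadoop.fs.s3a.secret.key", "minioadmin")
    .config("spark.hadoop.fs.s3a.path.style.access", "true")
    .config("spark.hadoop.fs.s3a.connection.ssl.enabled", "false")
    .getOrCreate()
)

df = spark.read.parquet("/staging/trips")  # illustrative source data

(
    df.write.format("hudi")
    .option("hoodie.table.name", "trips")
    .option("hoodie.datasource.write.recordkey.field", "trip_id")
    .option("hoodie.datasource.write.precombine.field", "ts")
    # Sync the table definition into the Hive Metastore so Trino can see it
    .option("hoodie.datasource.hive_sync.enable", "true")
    .option("hoodie.datasource.hive_sync.mode", "hms")
    .option("hoodie.datasource.hive_sync.metastore.uris", "thrift://hive-metastore:9083")
    .option("hoodie.datasource.hive_sync.database", "hudi_db")
    .option("hoodie.datasource.hive_sync.table", "trips")
    .mode("append")
    .save("s3a://warehouse/hudi/trips")
)
```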
Trino's Hive connector is configured in hive.properties as follows:
```properties
hive.metastore.uri=thrift://hive-metastore:9083
fs.native-s3.enabled=false
hive.s3.aws-access-key=minioadmin
hive.s3.aws-secret-key=minioadmin
hive.s3.endpoint=http://minio:9000
hive.s3.path-style-access=true
hive.s3.ssl.enabled=false
```
Additionally, we have ensured that all required configurations are included in the Hadoop hive-site.xml file, including the fs.s3a.impl property set to org.apache.hadoop.fs.s3a.S3AFileSystem, along with other necessary S3A configurations for MinIO compatibility.
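Concretely, the S3A entries in hive-site.xml follow this pattern (a sketch; the property names are the standard hadoop-aws ones, and the values mirror the rest of our setup):

```xml
<!-- hive-site.xml: S3A settings for MinIO (sketch) -->
<property>
  <name>fs.s3a.impl</name>
  <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
</property>
<property>
  <name>fs.s3a.endpoint</name>
  <value>http://minio:9000</value>
</property>
<property>
  <name>fs.s3a.access.key</name>
  <value>minioadmin</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>minioadmin</value>
</property>
<property>
  <name>fs.s3a.path.style.access</name>
  <value>true</value>
</property>
<property>
  <name>fs.s3a.connection.ssl.enabled</name>
  <value>false</value>
</property>
```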
The issue arises when querying Hudi tables or creating external tables in Trino using s3a:// URIs. We receive the following error:
```
java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found
```
To resolve this, we have taken the following steps:
1. Copied the required JARs (hadoop-aws-3.2.4.jar, hadoop-client-api-3.2.4.jar, hadoop-client-runtime-3.2.4.jar, and aws-java-sdk-bundle-1.11.375.jar) into /usr/lib/trino/plugin/hive/ inside the Trino container.
2. Verified the presence of these JARs in the container using ls /usr/lib/trino/plugin/hive/, which lists all four JARs (the exact commands are sketched after this list).
3. Built a custom Trino Docker image with the following Dockerfile:
```dockerfile
FROM trinodb/trino:474
USER root
COPY jars/*.jar /usr/lib/trino/plugin/hive/
```
4. Restarted the Trino container and confirmed that it starts without issues.

Despite these efforts, the error persists, indicating that Trino is not loading the org.apache.hadoop.fs.s3a.S3AFileSystem class.
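For completeness, the verification commands were along these lines (the container name trino is an assumption; substitute your actual container or service name):

```sh
# Confirm the copied JARs are present in the Hive plugin directory
docker exec -it trino ls /usr/lib/trino/plugin/hive/

# Check the server log for the failing class
docker logs trino 2>&1 | grep -i "S3AFileSystem"
```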
We would greatly appreciate your expertise in identifying the root cause of this issue and suggesting a solution. Please let us know if additional logs, configurations, or details about our environment would be helpful.
Thank you for your assistance, and we look forward to your guidance.