Inputfile SMILES ERROR #14

Tiger2Wings · 2025-03-19T11:09:04Z

Hi, excellent job!

We have a problem.
When we run the command, we get this error:
ERROR (2, 10) [21, 22, 23, 24] [24NH2][21CH2][22CH2][23OH]
What does this line mean?
We believe that one SMILES in the input file (C1=CC(=C(C=C1[N+](=O)[O-])Cl)NC(=O)C2=C(C=CC(=C2)Cl)O.C(CO)N) is wrong.

Could you please modify the program so that it automatically skips problematic SMILES and saves them to error_smiles.csv?
It would help a lot.
Thank you!

ch4perone · 2025-03-28T09:48:14Z

Hi, thanks for the remark.
Which version are you using? In the latest release, v0.1.2, I implemented skipping of SMILES strings that cause input reading errors, and the --debug option prints the SMILES that cause problems.
Can you confirm that you still encounter this error in v0.1.2, and could you send me the command-line output (in debug mode) so I can see at which point it crashes?

Tiger2Wings · 2025-04-19T09:53:35Z

The md5sum of fiora-main/README.md is "c4e14e6815450ce09005264295aef554". Does that mean the version is 0.1.2?

The listed Instrument_type values are ["HCD", "Q-TOF", "IT-FT/ion trap with FTMS", "IT/ion trap"]. Are other types available?
If the Instrument_type is "ABSCIEX", which is not one of ["HCD", "Q-TOF", "IT-FT/ion trap with FTMS", "IT/ion trap"], the command "fiora-predict" still works. So which Instrument_type does the program use by default in that case?

The input file is attached: error1.csv

I ran "fiora-predict -i error1.csv -o e1.mgf --debug". How can it detect invalid SMILES?
The command prints:

Running fiora prediction with the following parameters: Namespace(input='error1.csv', output='e1.mgf', model='default', dev='cpu', min_prob=0.001, rt=False, ccs=False, annotation=False, debug=True)

-----Model-----
Fiora OS v0.1.0
---------------

Disclaimer: No prediction software is perfect. This is an early open-source model. Use with caution.
ERROR (0, 1) [22] [22Cl-]
Traceback (most recent call last):
  File "/home/ubuntu/.conda/envs/fiora/bin/fiora-predict", line 196, in <module>
    main()
  File "/home/ubuntu/.conda/envs/fiora/bin/fiora-predict", line 179, in main
    df, invalid_df = build_metabolites(df, model.model_params)
  File "/home/ubuntu/.conda/envs/fiora/bin/fiora-predict", line 105, in build_metabolites
    df["Metabolite"].apply(lambda x: x.fragment_MOL(depth=1))
  File "/home/ubuntu/.local/lib/python3.10/site-packages/pandas/core/series.py", line 4924, in apply
    ).apply()
  File "/home/ubuntu/.local/lib/python3.10/site-packages/pandas/core/apply.py", line 1427, in apply
    return self.apply_standard()
  File "/home/ubuntu/.local/lib/python3.10/site-packages/pandas/core/apply.py", line 1507, in apply_standard
    mapped = obj._map_values(
  File "/home/ubuntu/.local/lib/python3.10/site-packages/pandas/core/base.py", line 921, in _map_values
    return algorithms.map_array(arr, mapper, na_action=na_action, convert=convert)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/pandas/core/algorithms.py", line 1743, in map_array
    return lib.map_infer(values, mapper, convert=convert)
  File "lib.pyx", line 2972, in pandas._libs.lib.map_infer
  File "/home/ubuntu/.conda/envs/fiora/bin/fiora-predict", line 105, in <lambda>
    df["Metabolite"].apply(lambda x: x.fragment_MOL(depth=1))
  File "/home/ubuntu/.conda/envs/fiora/lib/python3.10/site-packages/fiora/MOL/Metabolite.py", line 210, in fragment_MOL
    self.fragmentation_tree.build_fragmentation_tree(self.MOL, self.edges_as_tuples, depth=depth)
  File "/home/ubuntu/.conda/envs/fiora/lib/python3.10/site-packages/fiora/MOL/FragmentationTree.py", line 132, in build_fragmentation_tree
    _, fragments = self.create_Fragments(mol, i, j, original_mol_isotopes=mol_isotopes)
  File "/home/ubuntu/.conda/envs/fiora/lib/python3.10/site-packages/fiora/MOL/FragmentationTree.py", line 173, in create_Fragments
    return new_mol, [Fragment(m, edge=(int(i), int(j)), isotope_labels=original_mol_isotopes) for m in fragment_mols]
  File "/home/ubuntu/.conda/envs/fiora/lib/python3.10/site-packages/fiora/MOL/FragmentationTree.py", line 173, in <listcomp>
    return new_mol, [Fragment(m, edge=(int(i), int(j)), isotope_labels=original_mol_isotopes) for m in fragment_mols]
  File "/home/ubuntu/.conda/envs/fiora/lib/python3.10/site-packages/fiora/MOL/FragmentationTree.py", line 27, in __init__
    raise ValueError("Unidentified edge in fragment")
ValueError: Unidentified edge in fragment
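
The traceback above shows a single ValueError, raised inside Series.apply, aborting the whole run. One possible way to keep going is to guard each molecule individually and collect the failures. The following is only a minimal sketch of that idea: apply_safely is a hypothetical helper, and fragment is a made-up stand-in for the per-molecule step that merely imitates the failure; neither is part of fiora.

```python
from typing import Callable, Iterable


def apply_safely(items: Iterable[str], func: Callable[[str], object]):
    """Apply func to each item; record ValueErrors instead of raising them."""
    results, failures = [], []
    for item in items:
        try:
            results.append((item, func(item)))
        except ValueError as exc:
            # Keep the offending input together with the error message.
            failures.append((item, str(exc)))
    return results, failures


def fragment(smiles: str) -> int:
    # Hypothetical stand-in for the real fragmentation step (not fiora's API):
    # it fails on dotted SMILES the same way the traceback does.
    if "." in smiles:
        raise ValueError("Unidentified edge in fragment")
    return len(smiles)


ok, bad = apply_safely(["CCO", "CCO.Cl"], fragment)
```

Here ok holds the inputs that were processed and bad holds the (SMILES, message) pairs, which could then be written to a file such as error_smiles.csv.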

ch4perone · 2025-04-22T09:22:14Z

I think the program has problems reading the "." symbol in the SMILES, though the error only occurs much later, when fragmenting the molecule. I will work on a fix soon. For now, I recommend removing every SMILES with a dot "." from the csv file. I hope this already helps.
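
As a rough illustration of that workaround, the dotted entries can be separated out before calling fiora-predict. This is only a sketch using the Python standard library: the function name, the output file names and the column name "SMILES" are assumptions and may need adjusting to the actual input file.

```python
import csv


def split_dotted_smiles(in_path, ok_path, err_path="error_smiles.csv", column="SMILES"):
    """Copy rows whose SMILES has no '.' to ok_path and the rest to err_path.

    The column name "SMILES" is an assumption about the input file.
    Returns (kept, skipped) row counts.
    """
    kept = skipped = 0
    with open(in_path, newline="") as src:
        reader = csv.DictReader(src)
        fields = reader.fieldnames or []
        if column not in fields:
            raise KeyError(f"column {column!r} not found in {in_path}")
        with open(ok_path, "w", newline="") as ok, open(err_path, "w", newline="") as err:
            ok_writer = csv.DictWriter(ok, fieldnames=fields, extrasaction="ignore")
            err_writer = csv.DictWriter(err, fieldnames=fields, extrasaction="ignore")
            ok_writer.writeheader()
            err_writer.writeheader()
            for row in reader:
                if "." in (row[column] or ""):
                    err_writer.writerow(row)  # multi-component SMILES: set aside
                    skipped += 1
                else:
                    ok_writer.writerow(row)
                    kept += 1
    return kept, skipped
```

A call such as split_dotted_smiles("error1.csv", "clean.csv") would leave the dotted entries in error_smiles.csv. Note that a plain "." test is a heuristic: it only looks at the string and does not parse the molecule.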

Regarding the instrument type: yours will automatically be flagged as the "Others" instrument type for model input. I recommend using "HCD" instead, since the OS model was predominantly trained on Orbitrap data. You should get better results, even if it's not technically correct.
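
If you want to apply this recommendation to a whole input file, one option is to rewrite unlisted instrument types before prediction. The helper below is a hypothetical sketch, not part of fiora; the list of values is taken from the question above, and whether fiora matches these strings exactly (case, whitespace) is an assumption.

```python
# Instrument types listed earlier in this thread.
SUPPORTED = ("HCD", "Q-TOF", "IT-FT/ion trap with FTMS", "IT/ion trap")


def normalize_instrument(value, fallback="HCD"):
    """Return value if it is a listed instrument type, otherwise the fallback."""
    value = (value or "").strip()
    return value if value in SUPPORTED else fallback
```

For example, normalize_instrument("ABSCIEX") gives "HCD", while a listed value such as "Q-TOF" is returned unchanged.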

Tiger2Wings · 2025-04-23T12:44:16Z

I have just written a script that screens the SMILES and then runs the predictions:
fiora_check1.py
(The .py file type is not allowed as an attachment, so just rename the uploaded .txt to .py.)

It looks like the error occurs when a molecule is NOT singly connected, i.e. when the SMILES describes more than one disconnected fragment.

from rdkit import Chem

def is_single_connected(smiles):
    # True only if the SMILES parses and forms exactly one connected fragment.
    mol = Chem.MolFromSmiles(smiles)
    return mol is not None and len(Chem.GetMolFrags(mol)) == 1

fiora_check1.txt
