Multi-lingual Question Generating Model (mt5-base)

Give the model a passage and it will generate a question about the passage.

Trained on the following datasets:

Training details

I trained the model with the Flax summarization script on a TPU v3-8. The script expects a text column and a summary column. For question generation, use the context column in place of the text column and the question column in place of the summary column.
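The column swap described above can be sketched as a small preprocessing step. This is only an illustration, not the preprocessing actually used for this model: the helper name `to_summarization_columns` and the toy record are hypothetical, and it assumes each training record is a dict with `context` and `question` fields.

```python
from typing import Dict, Iterable, List


def to_summarization_columns(
    records: Iterable[Dict[str, str]],
    context_key: str = "context",
    question_key: str = "question",
) -> List[Dict[str, str]]:
    """Rename QA-style fields to the text/summary pair a summarization script reads."""
    converted = []
    for record in records:
        converted.append(
            {
                "text": record[context_key],      # the passage becomes the model input
                "summary": record[question_key],  # the question becomes the target
            }
        )
    return converted


# Toy record for illustration only; it is not taken from the training data.
examples = [
    {
        "context": "Paris is the capital of France.",
        "question": "What is the capital of France?",
    }
]
rows = to_summarization_columns(examples)
print(rows[0]["text"])
print(rows[0]["summary"])
```

If the training script lets you choose which columns to read, pointing it at the context and question columns directly would avoid this renaming step.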

There is no guarantee that the model will produce a question in the language of the passage, but it usually does. Lower-resource languages will likely yield lower-quality questions.

Using the model

PyTorch version

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the tokenizer and model weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("nbroad/mt5-base-qgen")
model = AutoModelForSeq2SeqLM.from_pretrained("nbroad/mt5-base-qgen")

text = "Hugging Face has seen rapid growth in its \
popularity since the get-go. It is definitely doing\
 the right things to attract more and more people to \
 its platform, some of which are on the following lines:\
Community driven approach through large open source repositories \
along with paid services. Helps to build a network of like-minded\
 people passionate about open source. \
Attractive price point. The subscription-based features, e.g.: \
Inference based API, starts at a price of $9/month.\
"

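# Tokenize the passage and generate a question of at most 40 tokens
inputs = tokenizer(text, return_tensors="pt")
output = model.generate(**inputs, max_length=40)

# Decode the generated token IDs back into text
print(tokenizer.decode(output[0], skip_special_tokens=True))
# What is Hugging Face's price point?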
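To generate questions for several passages at once, the same calls can be wrapped in a small helper. This is a rough sketch, not part of the original model card: the function name `generate_questions`, the beam-search setting `num_beams=4`, and the 512-token input cap are assumptions chosen for illustration. It expects the `model` and `tokenizer` objects loaded as shown above.

```python
from typing import List


def generate_questions(
    model,
    tokenizer,
    passages: List[str],
    max_length: int = 40,
    num_beams: int = 4,  # assumed value; the original example uses the default decoding
) -> List[str]:
    """Generate one question per passage."""
    # Pad to the longest passage and truncate very long ones so the batch
    # forms a single tensor. The 512-token cap is an assumption.
    inputs = tokenizer(
        passages,
        return_tensors="pt",
        padding=True,
        truncation=True,
        max_length=512,
    )
    outputs = model.generate(**inputs, max_length=max_length, num_beams=num_beams)
    # batch_decode turns every generated sequence back into a string
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)


# Usage, with the objects loaded earlier:
# questions = generate_questions(model, tokenizer, [text, another_passage])
```

Beam search is only one possible decoding choice; whether it improves question quality for a given language would need to be checked empirically.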

Model trained on Cloud TPUs from Google's TPU Research Cloud (TRC)

Model size: 0.6B params (Safetensors, tensor type F32)
