VizSeq

VizSeq is a Python toolkit for visual analysis of text generation tasks such as machine translation, summarization, image captioning, speech translation, and video description. It takes multimodal sources, text references, and text predictions as inputs, and analyzes them visually in Jupyter Notebook or a built-in Web App (the former has Fairseq integration). VizSeq also provides a collection of multi-process scorers that can be used as a normal Python package.
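
For example, in Jupyter Notebook the analysis can be driven with a few helper calls. The following is a minimal sketch; the view_stats, view_scores and view_examples helpers follow the project's example notebooks, and the file paths are placeholders for your own data:

import vizseq

# placeholder paths: one source file, one reference file, one prediction file
src, ref, hypo = 'data/src_0.txt', 'data/ref_0.txt', 'data/pred_0.txt'

vizseq.view_stats(src, ref)                             # dataset statistics
vizseq.view_scores(ref, hypo, ['bleu'])                 # corpus/group scores
vizseq.view_examples(src, ref, hypo, metrics=['bleu'])  # per-example view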

Please also see the paper https://arxiv.org/pdf/1909.05424.pdf for more details.

Task Coverage

VizSeq accepts various source types, including text, image, audio, video, or any combination of them. This covers a wide range of text generation tasks, examples of which are listed below:

Text: machine translation, text summarization, dialog generation, grammatical error correction, open-domain question answering
Image: image captioning, image question answering, optical character recognition
Audio: speech recognition, speech translation
Video: video description
Multimodal: multimodal machine translation

Metric Coverage

All scorers are accelerated with multi-processing/multi-threading.

N-gram-based: BLEU, chrF, METEOR, TER, RIBES, GLEU, NIST, ROUGE, CIDEr, WER
Embedding-based: LASER, BERTScore
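
Outside the visual interfaces, every scorer is also directly usable as a normal Python API. Below is a minimal sketch with the BLEU scorer; the constructor arguments shown here (corpus_level, sent_level, n_workers) are assumed from the VizSeqScorer interface described in the next section, with n_workers controlling the multi-processing mentioned above:

from vizseq.scorers.bleu import BLEUScorer

# corpus-level BLEU over a toy hypothesis/reference pair
scorer = BLEUScorer(corpus_level=True, sent_level=False, n_workers=2)
score = scorer.score(
    ['This is a test sentence.'],    # hypotheses, one string per example
    [['This is a test sentence.']]   # one or more groups of references
)
print(score.corpus_score)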

Adding a New Metric

VizSeq has an open API for adding user-defined metrics. You are welcome to contribute new scorers to expand VizSeq's metric coverage!

Implementing a New Scorer Class

First, add new_metric.py to vizseq/scorers, in which a new scorer class inherits from VizSeqScorer and defines a score method. Then register the new scorer class with an id and a name using vizseq.scorers.register_scorer:

from typing import Optional, List

from vizseq.scorers import register_scorer, VizSeqScorer, VizSeqScore

@register_scorer('new_metric_id', 'New Metric Name')
class NewMetricScorer(VizSeqScorer):
    def score(
            self, hypothesis: List[str], references: List[List[str]],
            tags: Optional[List[List[str]]] = None
    ) -> VizSeqScore:
        # set the number of workers based on the number of examples
        self._update_n_workers(len(hypothesis))

        corpus_score, group_scores, sent_scores = None, None, None

        if self.corpus_level:
            # implement the corpus-level score here
            corpus_score = 99.9
        if self.sent_level:
            # implement the sentence-level scores here (one per example)
            sent_scores = [99.9, 99.9]
        if tags is not None:
            tag_set = self._unique(tags)
            # implement the group-level (by sentence tag) scores here
            group_scores = {t: 99.9 for t in tag_set}

        return VizSeqScore.make(
            corpus_score=corpus_score, sent_scores=sent_scores,
            group_scores=group_scores
        )
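
Once registered, the new scorer can be referenced by its id just like the built-in ones, e.g. metrics=['new_metric_id'] in the Jupyter helpers above. To verify the registration (assuming the available_scorers helper at the package root):

import vizseq

# 'new_metric_id' should now appear among the registered scorer ids
print(vizseq.available_scorers())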

Testing the New Scorer Class

All scorer classes need to be covered by tests. To achieve that, add a unit test test_new_metric.py to tests/scorers and run:

python -m unittest tests.scorers.test_new_metric
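
A minimal test might look like the sketch below; the corpus_level and sent_level constructor flags are assumed from the VizSeqScorer interface used above:

import unittest

from vizseq.scorers.new_metric import NewMetricScorer

class NewMetricScorerTestCase(unittest.TestCase):
    def test_score(self):
        hypothesis = ['This is a test.', 'Another test.']
        references = [['This is a test.', 'Another test.']]
        score = NewMetricScorer(corpus_level=True, sent_level=True).score(
            hypothesis, references
        )
        # the returned VizSeqScore should carry corpus- and sentence-level scores
        self.assertIsNotNone(score.corpus_score)
        self.assertEqual(len(score.sent_scores), len(hypothesis))

if __name__ == '__main__':
    unittest.main()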

License

VizSeq is licensed under MIT.