Self-host a multilabel archive

Cometa is free, open-source software, so you can use it to host your own multilabel archive. If you have a family of multilabel datasets you want to publish, Cometa handles partitioning, metadata generation and the website for you.

Preparing your datasets

Cometa uses the mldr R package to extract metadata and partition datasets, so the first step is to install the mldr and mldr.datasets packages and import your datasets:

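install.packages(c("mldr", "mldr.datasets"))
library(mldr)

# Read mydataset.arff and mydataset.xml from the working directory
mydataset <- mldr("mydataset")

# Save your dataset in an R-friendly format (the target folder must exist)
dir.create("public/full", recursive = TRUE, showWarnings = FALSE)
saveRDS(mydataset, file = "public/full/mydataset.rds")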

Please refer to the mldr::mldr documentation for other ways to import your data.
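
Before running the import, it can help to confirm that both input files are present and that the output folder exists. The following is a minimal sketch, not part of Cometa or mldr: the function name prepare_import and its arguments are assumptions made for illustration.

```shell
# Hypothetical helper, not part of Cometa or mldr.
# Checks that NAME.arff and NAME.xml exist, then creates public/full under ROOT.
prepare_import() {
  name="$1"            # dataset path without extension, e.g. ./mydataset
  root="${2:-.}"       # folder that will contain public/full (default: current folder)
  for ext in arff xml; do
    if [ ! -f "$name.$ext" ]; then
      echo "missing input file: $name.$ext" >&2
      return 1
    fi
  done
  mkdir -p "$root/public/full"
}
```

For example, if the R commands above were saved in a file called import.R (a hypothetical name), you could run `prepare_import mydataset && Rscript import.R` from the folder containing the dataset files.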

Running Cometa

Cometa is easy to self-host thanks to its Docker image. First, install Docker by following the instructions on the official site. Then start Cometa in interactive mode with the following command in your terminal:

docker run -itp 8080:80 --mount type=bind,source="$HOME/multilabel/public",target=/usr/app/public fdavidcl/cometa

Replace $HOME/multilabel/public with the path to your own public folder. Cometa looks for your datasets in RDS format inside its full/ subfolder.
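
A quick pre-flight check can confirm that the folder you are about to mount has the expected layout. This is only a sketch under stated assumptions: list_datasets is a hypothetical helper, and it assumes the files use the lowercase .rds extension shown in the import example.

```shell
# Hypothetical pre-flight check, not part of Cometa.
# Prints the name of every .rds dataset found in PUBLIC_DIR/full
# and fails if the folder is missing or contains no datasets.
list_datasets() {
  public_dir="$1"
  if [ ! -d "$public_dir/full" ]; then
    echo "no full/ folder inside $public_dir" >&2
    return 1
  fi
  found=0
  for f in "$public_dir"/full/*.rds; do
    [ -e "$f" ] || continue    # the glob matched nothing
    basename "$f" .rds
    found=1
  done
  [ "$found" -eq 1 ]
}
```

Running `list_datasets "$HOME/multilabel/public"` before starting the container is worthwhile because a bind mount created with --mount fails if the source path does not exist.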

Once the image has finished downloading, you should see a welcome message and a menu of options:

  1. Partition datasets
  2. Create summaries of your data
  3. Launch Cometa server
  4. Quit

If you want to serve partitions of your datasets, the first option applies several partitioning and cross-validation strategies to your data. Bear in mind that this can take anywhere from a few minutes to several hours, depending on the size of the datasets.

The second option creates the metadata needed to display metrics and other information about your datasets on the website.

Once the metadata has been generated, you can launch the web server with the third option and open http://localhost:8080 to browse your Cometa-powered website. To run the web server non-interactively, use the following command instead:

docker run -dp 8080:80 --mount type=bind,source="$HOME/multilabel/public",target=/usr/app/public fdavidcl/cometa

This runs the container in detached mode, so you will not see any log output or a terminal prompt.
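
The two docker run commands above differ only in their flags, so they can be wrapped in one small function. The sketch below is an illustration rather than an official Cometa script: the name run_cometa, its arguments and the DRY_RUN variable are all assumptions.

```shell
# Hypothetical wrapper around the two docker run commands shown above.
# Set DRY_RUN=1 to print the command instead of executing it.
run_cometa() {
  mode="$1"            # "interactive" or "detached"
  public_dir="$2"      # absolute path to your public folder
  port="${3:-8080}"    # host port mapped to port 80 in the container
  case "$mode" in
    interactive) flags="-itp" ;;
    detached)    flags="-dp" ;;
    *) echo "usage: run_cometa interactive|detached PUBLIC_DIR [PORT]" >&2
       return 2 ;;
  esac
  # Rebuild the argument list so that paths containing spaces stay intact
  set -- docker run "$flags" "$port:80" \
    --mount "type=bind,source=$public_dir,target=/usr/app/public" \
    fdavidcl/cometa
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}
```

After starting a detached container, for instance with `run_cometa detached "$HOME/multilabel/public"`, `docker ps` lists the running containers and their IDs, `docker logs <container-id>` prints the output you would otherwise see in the terminal, and `docker stop <container-id>` shuts the server down.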