Polyaxon v0.4.3: stable — Improved dashboard, setup, and documentation

Source: Deep Learning on Medium


Author: Mourad

Today, we are pleased to announce the v0.4.3 release, a stable version with improved functionalities and documentation.

This release also coincides with several milestones: several clusters have been running non-stop for over a year, the platform has been adopted by several Fortune 500 companies, and millions of experiments have been tracked since the initial release (based on the opt-in metrics reporting).

Polyaxon is now moving towards semantic versioning and, starting with the next release, v0.5, will provide versioned documentation and migration notes from one version to the next.

Polyaxon Dashboard

Polyaxon is an open source, self-hosted platform for model management and for orchestrating machine learning (ML) and deep learning (DL) workloads on Kubernetes.

Install or upgrade Polyaxon.

Developing Polyaxon as an open source project, with a product used by thousands of machine learning and deep learning practitioners every week, means we receive a constant stream of feedback. One request in particular keeps recurring: “Improving functionality X would help me more than adding new features.”

That’s why this release focuses on productivity and stability improvements that every Polyaxon user (machine learning practitioners and DevOps engineers) can benefit from.

  • Effort to bring more CLI functionalities to the dashboard
  • Better Ingress support and documentation
  • Better SSL documentation
  • Support of Custom Cluster DNS
  • Improved Helm chart
  • Better deployment config validation

More functionalities in the dashboard

When we started Polyaxon, the dashboard was just a nice-to-have that complemented the CLI by giving a visual view of the operations it created. Over the last couple of months we have been adding more functionality to the dashboard, enabling users to bookmark, archive, restore, delete, and filter jobs and experiments.

The adoption of the dashboard has increased, and we noticed that most users look for functionalities in the dashboard before checking the reference documentation for the CLI.

In this release, we made many changes to the dashboard so that users can perform most of the important actions without needing the CLI:

  • Re-enabling the possibility to create projects from the dashboard
  • Adding the possibility to create experiments, experiment groups, jobs, and builds from the dashboard
  • Starting and stopping notebooks from the dashboard
  • Starting and stopping tensorboards from the dashboard for experiments, groups, selections, and projects
  • Restarting experiments
  • Tables with search enabled to quickly check the history of created and running notebooks and tensorboards
  • Better handling of errors, loading indicators, and several UX improvements
  • Adding default searches to the tables to enable users to quickly filter based on some statuses
  • Quick shortcuts to go directly to integrations, references, and other documentation sections

You should expect more features and improvements in the next couple of months, in particular:

  • Possibility to create/sync templates to use for starting experiments and jobs
  • Simpler search and filtering with UI components to help you create queries in an intuitive way
  • Better visualization
  • Better artifacts tagging and management
  • Better insights and metrics on your jobs and experiments, per project and cluster-wide
  • Moving the cluster-wide dynamic configuration from the Enterprise Edition (EE) to the Community Edition (CE) to easily set and update deployment options

Improved Ingress support and documentation

Polyaxon and NGINX ingress documentation

Prior to v0.4.3, the Polyaxon Helm chart bundled an NGINX Ingress controller, which made our ingress resource useless to many companies. Although we provided a way to modify the annotations, it was not clear what to change, which led several large teams to opt out of our ingress and create their own.

In this version, the Polyaxon Helm chart no longer includes an Ingress controller and instead ships a generic ingress resource that can be used with any Ingress controller. We wrote reference documentation for NGINX, and we will add documentation for other stable, widely used ingress controllers in the future.
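To illustrate what a controller-agnostic ingress resource looks like, here is a minimal Python sketch that builds a Kubernetes Ingress manifest as a plain dictionary. It is not the template shipped in the Polyaxon chart: the host, service name, port, and secret name are hypothetical, and the manifest uses the current `networking.k8s.io/v1` schema, which may differ from what the v0.4.3 chart rendered. The point is that anything controller-specific is passed in through the ingress class and annotations rather than baked into the resource.

```python
import json


def build_ingress(host, service_name, service_port,
                  ingress_class=None, annotations=None, tls_secret=None):
    """Return a Kubernetes Ingress manifest (networking.k8s.io/v1) as a dict.

    All argument values are supplied by the caller; nothing here is specific
    to one ingress controller.
    """
    manifest = {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {
            "name": f"{service_name}-ingress",
            # Controller-specific tuning lives in annotations only.
            "annotations": dict(annotations or {}),
        },
        "spec": {
            "rules": [{
                "host": host,
                "http": {"paths": [{
                    "path": "/",
                    "pathType": "Prefix",
                    "backend": {"service": {
                        "name": service_name,
                        "port": {"number": service_port},
                    }},
                }]},
            }],
        },
    }
    if ingress_class:
        # Selects which installed controller should serve this resource.
        manifest["spec"]["ingressClassName"] = ingress_class
    if tls_secret:
        # Enables TLS termination using a certificate stored in a Secret.
        manifest["spec"]["tls"] = [{"hosts": [host], "secretName": tls_secret}]
    return manifest


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    ingress = build_ingress(
        host="polyaxon.example.com",
        service_name="polyaxon-api",
        service_port=80,
        ingress_class="nginx",
        annotations={"nginx.ingress.kubernetes.io/proxy-body-size": "4g"},
        tls_secret="polyaxon-tls",
    )
    print(json.dumps(ingress, indent=2))
```

Switching to another controller in this sketch means changing only the `ingress_class` and `annotations` arguments; the rules and backend stay the same.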

Better Support of Custom Cluster DNS

Every day, several users try to deploy Polyaxon on custom Kubernetes clusters, often with a custom cluster DNS. Polyaxon deployments sometimes fail on these clusters. We suggested KubeDNS to the users who reached out to us, but many others simply moved on without having the chance to try our platform.

We now have documentation on how to configure Polyaxon to work with any DNS backend or custom cluster DNS, making the platform compatible with most Kubernetes installations.
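As background on why a custom cluster DNS matters: Kubernetes services are reachable inside the cluster at names of the form `<service>.<namespace>.svc.<cluster-domain>`, where the cluster domain defaults to `cluster.local`. A component that assumes the default domain cannot resolve its peers on a cluster configured with a different one. The Python sketch below shows the naming rule only; the service and namespace names are hypothetical, and it is an assumption (not stated above) that this is the failure mode behind the reported deployment problems.

```python
def service_fqdn(service, namespace, cluster_domain="cluster.local"):
    """Build the in-cluster DNS name of a Kubernetes service.

    `cluster_domain` defaults to the Kubernetes default, `cluster.local`;
    clusters with a custom DNS setup may use a different domain.
    """
    return f"{service}.{namespace}.svc.{cluster_domain}"


if __name__ == "__main__":
    # Hypothetical service and namespace names, for illustration only.
    print(service_fqdn("polyaxon-api", "polyaxon"))
    print(service_fqdn("polyaxon-api", "polyaxon", cluster_domain="custom.cluster.name"))
```

Making the cluster domain a parameter instead of a hard-coded constant is what allows the same code to run on both default and custom clusters.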

Better SSL documentation

The recommended way to serve Polyaxon over HTTPS is to use an ingress with TLS enabled; the new ingress resource gives you complete control over how to deploy the ingress controller. However, some users deploy Polyaxon with a NodePort service type exposed on the internet, and we are also in the process of releasing a public version of Polyaxon Tracking on docker, docker-compose, and other container services. We therefore added reference documentation on how to set up SSL for the Polyaxon API using self-signed certificates or browser-trusted certificates.
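The practical difference between the two certificate types shows up on the client side: a browser-trusted certificate validates against the system trust store, while a self-signed one must be trusted explicitly. The sketch below uses only Python's standard `ssl` module and is not part of Polyaxon or its client; the function name and the certificate path mentioned afterwards are hypothetical.

```python
import ssl


def make_client_context(ca_file=None):
    """Create a TLS client context that verifies the server certificate.

    With `ca_file=None`, the system trust store is used, which suits
    browser-trusted certificates. For a self-signed deployment, pass the
    path to the certificate (or its CA) so that it is trusted explicitly.
    """
    # create_default_context enables certificate and hostname verification.
    return ssl.create_default_context(cafile=ca_file)


if __name__ == "__main__":
    context = make_client_context()
    print(context.verify_mode == ssl.CERT_REQUIRED)
    print(context.check_hostname)
```

Such a context can be passed to standard library clients, for example `urllib.request.urlopen(url, context=context)`. For a self-signed setup you would call `make_client_context("/path/to/polyaxon.crt")` with the path to your own certificate, which keeps verification on instead of disabling it.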

Improved Helm chart

We made several improvements to the Polyaxon Helm chart to make it more readable, and we removed several pluggable components. Notably, we removed the NFS-Provisioner from the chart and moved it to an external repo, so users can still opt in to it to provision multi read/write (ReadWriteMany) volumes for hosting their data, outputs/artifacts, and logs.

Better deployment config validation

Polyaxon deploy is now in “stable beta” and can be used to validate your deployment config files, as well as other necessary dependencies, before installing or upgrading Polyaxon.

We will be adding a new documentation section that outlines more advanced usage, such as specifying the version and deployment type in the deployment config file so that, for example, you do not upgrade Polyaxon to a newer version by mistake.
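The idea of pinning the version and deployment type can be sketched as a small pre-flight check. This is not how Polyaxon deploy is implemented: the key names `deploymentType` and `deploymentVersion` and the set of allowed types are assumptions made for illustration, and the config is represented as an already-parsed dictionary.

```python
# Assumed set of deployment types, for illustration only.
ALLOWED_DEPLOYMENT_TYPES = {"kubernetes", "docker-compose"}


def validate_deployment_config(config, target_version):
    """Return a list of problems found in a parsed deployment config.

    `config` is a dict; the key names checked here are hypothetical.
    `target_version` is the version the operator is about to install.
    """
    errors = []

    deployment_type = config.get("deploymentType")
    if deployment_type not in ALLOWED_DEPLOYMENT_TYPES:
        errors.append(f"unsupported deploymentType: {deployment_type!r}")

    pinned_version = config.get("deploymentVersion")
    if pinned_version is None:
        # Without a pinned version, nothing guards against accidental upgrades.
        errors.append("deploymentVersion is not pinned")
    elif pinned_version != target_version:
        errors.append(
            f"config pins version {pinned_version}, "
            f"but version {target_version} is about to be installed"
        )

    return errors


if __name__ == "__main__":
    config = {"deploymentType": "kubernetes", "deploymentVersion": "0.4.3"}
    print(validate_deployment_config(config, target_version="0.4.3"))
    print(validate_deployment_config(config, target_version="0.5.0"))
```

Returning a list of problems rather than raising on the first one lets the operator see every issue in the file before attempting an install or upgrade.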

Other improvements and bug fixes

We fixed several issues and regressions. One notable regression was related to authenticating the native docker builder to pull private DockerHub images. We also published new documentation for private DockerHub images, in addition to the documentation for generic docker registries.

Polyaxon and dockerhub documentation

We fixed the browser caching issue that required users to hard-reload their page after an upgrade, as well as several other small UX issues.

We also updated the documentation and the chart to set the “eventMonitors” service as a singleton.

Conclusion

Polyaxon is now stable and production-ready. It was already running in production at several companies, and it powers the work of thousands of researchers and machine learning practitioners every week.

We are very thankful to all companies, teams, and research institutions who used our platform and provided feedback to improve it.

In the next release we will push for more automation: we will release a public beta of the Machine Learning CI tool and of Polyflow, an event/action-based framework.

Polyaxon will keep improving and providing the simplest machine learning layer on Kubernetes. We hope that these updates will improve your workflows and increase your productivity, and again, thank you for your continued support.