Federated Learning + Person Re-identification: Benchmark, In-Depth Analysis, and Performance Optimization

This paper has been accepted as an oral presentation at ACMMM’20.

Person re-identification (ReID) is an important computer vision task, but its development is constrained by growing privacy concerns. Federated learning is a privacy-preserving machine learning technique that learns a shared model across decentralized clients. In this paper, we apply federated learning to person re-identification (FedReID) and optimize its performance under the statistical heterogeneity of real-world scenarios, guided by insights from a newly constructed benchmark.

Statistical heterogeneity: (1) non-independent and identically distributed (non-IID) data; (2) unbalanced data volumes across clients.
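
To make these two factors concrete, here is a small illustrative sketch with made-up client statistics (our own example, not numbers from the benchmark): each simulated client holds a different ReID dataset, so the identities it sees differ from other clients’ (non-IID labels) and its share of the total data varies widely (unbalanced volume).

```python
# Hypothetical client statistics, used only for illustration; the real
# benchmark uses nine public ReID datasets with their actual sizes.
clients = {
    "client_a": {"num_images": 32_000, "num_identities": 1_500},  # large client
    "client_b": {"num_images": 13_000, "num_identities": 700},
    "client_c": {"num_images": 1_100,  "num_identities": 90},     # small client
}

total_images = sum(c["num_images"] for c in clients.values())
for name, stats in clients.items():
    share = stats["num_images"] / total_images
    # Different identity sets per client -> non-IID label distributions;
    # very different shares of the total data -> unbalanced volume.
    print(f"{name}: {stats['num_identities']} identities, "
          f"{share:.1%} of all training images")
```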

This article gives a brief introduction to the paper in three parts:

  1. The benchmark, including datasets, a new federated algorithm, and scenarios
  2. Insights from the benchmark analysis
  3. Performance optimization methods

Benchmark

Datasets

The benchmark contains nine of the most popular ReID datasets.

These datasets come from different domains and vary in the number of images and identities, which simulates the statistical heterogeneity of real-world scenarios.

Algorithm: Federated Partial Averaging

We propose a new algorithm, Federated Partial Averaging (FedPav), because the standard federated algorithm, Federated Averaging (FedAvg), requires the server and all clients to share an identical model. In FedReID, however, client models differ: each client’s classifier dimension depends on the number of identities in its local dataset, which varies from client to client.
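
To make this mismatch concrete, below is a minimal PyTorch-style sketch (our own illustration, not the authors’ released code): every client uses the same backbone architecture, but its classification head is sized by the number of identities in its local dataset, so full-model averaging as in FedAvg is not directly possible. The identity counts here are hypothetical.

```python
import torch.nn as nn
from torchvision.models import resnet50

class ClientReIDModel(nn.Module):
    """Backbone shared across clients; classifier sized per client."""

    def __init__(self, num_local_ids: int, feature_dim: int = 2048):
        super().__init__()
        backbone = resnet50(weights=None)   # pretrained weights are also common
        backbone.fc = nn.Identity()         # keep the 2048-d feature output
        self.backbone = backbone            # this part can be synced with the server
        self.classifier = nn.Linear(feature_dim, num_local_ids)  # this part stays local

    def forward(self, x):
        features = self.backbone(x)
        return self.classifier(features)

# Two clients with different numbers of local identities (hypothetical):
# their classifiers have different shapes, so only the backbones can be averaged.
model_a = ClientReIDModel(num_local_ids=700)
model_b = ClientReIDModel(num_local_ids=1500)
```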

FedPav synchronizes only part of each client model with the server. The training steps of FedPav are as follows (a minimal code sketch appears after the list):

  1. The server sends the global model to the clients.
  2. Clients train the model on their local data with their own classifiers, obtaining local models.
  3. Clients upload only the backbone parameters; their classifiers stay local.
  4. The server aggregates the model updates, obtaining a new global model.
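
These steps repeat over many communication rounds. A minimal sketch of one round, with backbone updates weighted by local data size as in FedAvg, might look like the following (our simplified illustration; the client methods such as load_backbone and train_locally are hypothetical placeholders):

```python
import copy

def fedpav_aggregate(backbone_states, sample_counts):
    """Weighted average over backbone parameters only; classifiers never leave the clients."""
    total = float(sum(sample_counts))
    global_state = copy.deepcopy(backbone_states[0])
    for key in global_state:
        global_state[key] = sum(
            state[key].float() * (count / total)
            for state, count in zip(backbone_states, sample_counts)
        )
    return global_state

def fedpav_round(global_backbone_state, clients):
    """One FedPav communication round: broadcast, local training, upload, aggregation."""
    states, counts = [], []
    for client in clients:
        client.load_backbone(global_backbone_state)   # 1. server sends the global model
        client.train_locally()                        # 2. local training with the client's own classifier
        states.append(client.backbone_state_dict())   # 3. upload backbone parameters only
        counts.append(client.num_samples)
    return fedpav_aggregate(states, counts)           # 4. server aggregates into a new global model
```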

Insights from Benchmark Analysis

We provide several insights by analyzing the benchmark results in the paper. In this article, we highlight two of them that reveal the impact of statistical heterogeneity.

1. Clients with large datasets achieve lower accuracy with federated learning than with local training