Background: Sleep problems tend to vary according to the course of the disorder in individuals with mental health problems. Research in mental health has associated sleep pathologies with depression. However, the gold standard for sleep assessment, polysomnography (PSG), is not suitable for long-term, continuous monitoring of daily sleep, and methods such as sleep diaries rely on subjective recall, which is qualitative and inaccurate. Wearable devices, on the other hand, provide a low-cost and convenient means to monitor sleep in home settings. Objective: The main aim of this study was to devise and extract sleep features from data collected using a wearable device and analyze their associations with depressive symptom severity and sleep quality as measured by the self-assessed Patient Health Questionnaire 8-item (PHQ-8). Methods: Daily sleep data were collected passively by Fitbit wristband devices, and depressive symptom severity was self-reported every 2 weeks by the PHQ-8. The data used in this paper included 2812 PHQ-8 records from 368 participants recruited from 3 study sites in the Netherlands, Spain, and the United Kingdom. We extracted 18 sleep features from Fitbit data that describe participant sleep in the following 5 aspects: sleep architecture, sleep stability, sleep quality, insomnia, and hypersomnia. Linear mixed regression models were used to explore associations between sleep features and depressive symptom severity. The z score was used to evaluate the significance of the coefficient of each feature. Results: We tested our models on the entire dataset and separately on the data of 3 different study sites. 
We identified 14 sleep features that were significantly (P<.05) associated with the PHQ-8 score on the entire dataset. Among them, awake time percentage (z=5.45, P<.001), awakening times (z=5.53, P<.001), insomnia (z=4.55, P<.001), mean sleep offset time (z=6.19, P<.001), and hypersomnia (z=5.30, P<.001) were the top 5 features ranked by z score statistics. Associations between sleep features and PHQ-8 scores varied across sites, possibly due to differences in the populations. Many of our findings were consistent with previous studies that used other measurements to assess sleep, such as PSG and sleep questionnaires. Conclusions: We demonstrated that several derived sleep features extracted from consumer wearable devices show potential for the remote measurement of sleep as biomarkers of depression in real-world settings. These findings may provide the basis for the development of clinical tools to passively monitor disease state and trajectory, with minimal burden on the participant.
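As an illustration of how the reported z scores relate to the reported P values, the following minimal sketch converts a z statistic into a two-sided P value. It assumes that each coefficient's z score is compared against a standard normal distribution, which the abstract does not state explicitly; the function name `two_sided_p` and the dictionary `reported_z` are hypothetical and are not taken from the study's analysis code.

```python
import math


def two_sided_p(z: float) -> float:
    """Two-sided P value for a z statistic.

    Assumption: the statistic is referred to a standard normal distribution.
    erfc(|z| / sqrt(2)) is identical to 2 * (1 - Phi(|z|)).
    """
    return math.erfc(abs(z) / math.sqrt(2))


# z scores of the top 5 features as reported in the Results
reported_z = {
    "awake time percentage": 5.45,
    "awakening times": 5.53,
    "insomnia": 4.55,
    "mean sleep offset time": 6.19,
    "hypersomnia": 5.30,
}

for name, z in reported_z.items():
    print(f"{name}: z={z:.2f}, two-sided P={two_sided_p(z):.1e}")
```

Under this assumption, each of the 5 reported z scores corresponds to a two-sided P value below .001, in line with the values given in the Results.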