A Novel Sector-Based Algorithm for an Optimized Star-Galaxy Classification

Anumanchi Agastya Sai Ram Likhit*, Divyansh Tripathi*, Akshay Agarwal

* Equal contribution

ICLR 2024 Tiny Papers

Abstract

We introduce a sector-based method for star-galaxy classification using Sloan Digital Sky Survey data (SDSS-DR18). Instead of training on a single mixed sky distribution, the sky is segmented into sectors aligned with SDSS observation patterns, and a dedicated CNN is trained per sector setting. This improves robustness across local sky conditions and yields strong classification performance on sector-specific and combined evaluations.

Data Samples: Stars and Galaxies

Representative SDSS stars and galaxies used in this study, with examples from Sector 07 and Sector 13.

Stars and galaxies are difficult to tell apart by visual inspection alone. Reliable separation usually requires spectroscopic information and additional contextual measurements.

Method at a Glance

End-to-end pipeline: SQL metadata query, FITS retrieval, preprocessing, and CNN-based classification.

Sky Division

The sky is divided into 36 sectors: RA from 0° to 360° in six 60° bins, and Dec from +90° to -90° in six 30° bins.
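As an illustration of the 6 x 6 grid described above, the following minimal sketch maps a sky position to a sector number. The function name `sector_index` and, in particular, the numbering order (row-major, starting at the Dec band nearest +90° and increasing with RA inside each band) are assumptions made for this example; the source does not state how sectors are numbered.

```python
def sector_index(ra_deg: float, dec_deg: float) -> int:
    """Return a 1-based sector number in a 6 x 6 RA/Dec grid.

    ASSUMPTION: sectors are numbered row-major, starting at the Dec band
    nearest +90 deg and increasing with RA inside each band. The numbering
    used in the paper may differ.
    """
    if not -90.0 <= dec_deg <= 90.0:
        raise ValueError("Dec must lie in [-90, 90] degrees")
    ra = ra_deg % 360.0                               # wrap RA into [0, 360)
    ra_bin = min(int(ra // 60.0), 5)                  # six 60-degree RA bins: 0..5
    dec_bin = min(int((90.0 - dec_deg) // 30.0), 5)   # six 30-degree Dec bins: 0..5
    return dec_bin * 6 + ra_bin + 1


if __name__ == "__main__":
    # Under the assumed numbering, this position (RA 185, Dec +15) falls in sector 16.
    print(sector_index(185.0, 15.0))
```

Under this assumed ordering, two sectors whose numbers differ by 6 share the same RA range and sit in adjacent Dec bands.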

Data Workflow

An SQL metadata query yields FITS URLs; each image is cropped to 45x45 pixels, converted to PNG, stacked across the 5 filters, normalized, and augmented.
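The crop, stack, normalize, and augment steps can be sketched with NumPy as follows. This is a hedged illustration, not the authors' code: the helper names (`center_crop`, `make_sample`, `augment`) are hypothetical, the crop is assumed to be centred, per-band min-max scaling is an assumed normalization, and the flips and rotation are assumed augmentations. The FITS download and PNG conversion steps are omitted.

```python
import numpy as np

BANDS = ("u", "g", "r", "i", "z")  # the five SDSS imaging filters
CROP = 45                          # crop size in pixels, as stated in the workflow


def center_crop(img: np.ndarray, size: int = CROP) -> np.ndarray:
    """Cut a size x size window from the middle of a 2-D image (assumed centring)."""
    h, w = img.shape
    if h < size or w < size:
        raise ValueError("image is smaller than the crop window")
    top, left = (h - size) // 2, (w - size) // 2
    return img[top:top + size, left:left + size]


def make_sample(band_images: dict) -> np.ndarray:
    """Crop each band, scale it to [0, 1], and stack to shape (45, 45, 5)."""
    planes = []
    for band in BANDS:
        crop = center_crop(np.asarray(band_images[band], dtype=np.float32))
        lo, hi = float(crop.min()), float(crop.max())
        # ASSUMPTION: per-band min-max normalization; the source only says "normalize".
        planes.append((crop - lo) / (hi - lo) if hi > lo else np.zeros_like(crop))
    return np.stack(planes, axis=-1)


def augment(sample: np.ndarray) -> list:
    """ASSUMPTION: simple flips and one 90-degree rotation as augmentations."""
    return [sample, np.fliplr(sample), np.flipud(sample),
            np.rot90(sample, k=1, axes=(0, 1))]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fake = {b: rng.random((64, 64)) for b in BANDS}  # stand-in for real cutouts
    x = make_sample(fake)
    print(x.shape, len(augment(x)))
```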

Training Setup

Batch size 32, Adam optimizer, learning rate 0.001, binary cross-entropy for star-vs-galaxy output.
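For reference, the binary cross-entropy loss named above can be written out in a few lines. The sketch below uses only the Python standard library; the `TRAIN_CONFIG` dictionary and its key names are hypothetical and simply restate the listed hyperparameters, and the clipping constant `eps` is an assumption added for numerical safety.

```python
import math

# Hypothetical container that restates the hyperparameters listed above.
TRAIN_CONFIG = {
    "batch_size": 32,
    "optimizer": "adam",
    "learning_rate": 1e-3,
    "loss": "binary_crossentropy",
}


def binary_cross_entropy(y_true, y_prob, eps: float = 1e-7) -> float:
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip so that log() stays finite
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


if __name__ == "__main__":
    # Confident, correct predictions give a small loss.
    print(binary_cross_entropy([1, 0, 1], [0.9, 0.1, 0.8]))
```

A prediction of 0.5 for every sample gives a loss of ln 2, which is a convenient baseline when reading training curves.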

Evaluation Settings

Sector-specific and combined-sector evaluation, plus resilience on unseen sectors measured through Sector-7/13 tests and 4-way zero-shot tests.

Sector Map and Counts

The sector map shows the RA/Dec range and star/galaxy counts for each sector. Highlighted sectors: 10 and 16 (studied) and 7 and 13 (generalization). Sectors with no SDSS data are marked separately.

Results

Key quantitative results across sector-specific, combined, and generalization settings.

Main Metrics (Acc / P / R / F1)

Detailed metrics (Accuracy, Precision, Recall, F1) for Sector-10, Sector-16, and combined data.

Runtime per Epoch

Model      Sector-10   Sector-16   Combined
Proposed   15 s        13 s        25 s
CovNet     80 s        80 s        180 s
MargNet    1000 s      570 s       1610 s
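The per-epoch runtimes above translate directly into speedup factors. The short sketch below only divides the tabulated numbers; the names `RUNTIME_S` and `speedup` are hypothetical.

```python
# Seconds per epoch, copied from the runtime table.
RUNTIME_S = {
    "Proposed": {"Sector-10": 15, "Sector-16": 13, "Combined": 25},
    "CovNet": {"Sector-10": 80, "Sector-16": 80, "Combined": 180},
    "MargNet": {"Sector-10": 1000, "Sector-16": 570, "Combined": 1610},
}


def speedup(baseline: str, setting: str) -> float:
    """How many times faster the proposed model is than `baseline` per epoch."""
    return RUNTIME_S[baseline][setting] / RUNTIME_S["Proposed"][setting]


if __name__ == "__main__":
    for baseline in ("CovNet", "MargNet"):
        for setting in ("Sector-10", "Sector-16", "Combined"):
            print(f"{baseline:8s} {setting:10s} {speedup(baseline, setting):6.1f}x")
```

On the combined data, for example, the ratios are 180/25 = 7.2 against CovNet and 1610/25 = 64.4 against MargNet.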

Generalization

  • Sector-7 accuracy: 0.95 (Proposed) vs 0.93 (MargNet)
  • Sector-13 accuracy: 0.93 (Proposed) vs 0.78 (MargNet)
  • Zero-shot average accuracy: 0.89 (Proposed) vs 0.78 (MargNet)
  • Zero-shot accuracy standard deviation: 0.04 (Proposed) vs 0.17 (MargNet)

Confusion Matrix Summary

Setting     TP     FP    FN    TN
Sector-10   935    53    44    968
Sector-16   963    25    44    968
Combined    1858   123   67    1952

Sector-7 and Sector-13 accuracy: proposed model compared with MargNet.
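Accuracy, precision, recall, and F1 follow from the confusion-matrix counts above by the standard definitions. The sketch below applies those definitions to the tabulated counts; the names `metrics` and `CONFUSION` are hypothetical, and which class (star or galaxy) is treated as positive is not stated in the source, so the counts are used as labelled.

```python
def metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard binary-classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


# (TP, FP, FN, TN) copied from the confusion matrix summary.
CONFUSION = {
    "Sector-10": (935, 53, 44, 968),
    "Sector-16": (963, 25, 44, 968),
    "Combined": (1858, 123, 67, 1952),
}

if __name__ == "__main__":
    for name, counts in CONFUSION.items():
        m = metrics(*counts)
        print(name, {k: round(v, 4) for k, v in m.items()})
```

For instance, the Sector-10 accuracy is (935 + 968) / 2000 = 0.9515.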

Limitations and Future Work

  • Full-sky scaling: Extend training and evaluation from a subset of sectors to the complete 36-sector sky partition.
  • Domain-shift robustness: Improve stability when models are deployed on distant or previously unseen sectors.
  • Auxiliary context: Incorporate sector-level information such as sky position and survey metadata as additional inputs.
  • Model efficiency: Investigate slightly deeper yet computationally efficient architectures without sacrificing runtime performance.

BibTeX

@inproceedings{likhit2024a,
  title={A Novel Sector-Based Algorithm for an Optimized Star-Galaxy Classification},
  author={ANUMANCHI AGASTYA SAI RAM LIKHIT and Divyansh Tripathi and Akshay Agarwal},
  booktitle={The Second Tiny Papers Track at ICLR 2024},
  year={2024},
  url={https://openreview.net/forum?id=HzEefCle2c}
}