A Novel Sector-Based Algorithm for an Optimized Star-Galaxy Classification

Anumanchi Agastya Sai Ram Likhit^*, Divyansh Tripathi^*, Akshay Agarwal

^* Equal contribution

ICLR 2024 Tiny Papers

Abstract

We introduce a sector-based method for star-galaxy classification using Sloan Digital Sky Survey data (SDSS-DR18). Instead of training on a single mixed sky distribution, the sky is segmented into sectors aligned with SDSS observation patterns, and a dedicated CNN is trained per sector setting. This improves robustness across local sky conditions and yields strong classification performance on sector-specific and combined evaluations.

Data Samples: Stars and Galaxies

Representative SDSS stars and galaxies used in this study, with examples from Sector 7 and Sector 13.

Stars and galaxies are difficult to tell apart by visual inspection of survey images alone, because distant galaxies can appear as compact, nearly point-like sources. Reliable separation usually requires spectroscopic information and additional contextual measurements.

Method at a Glance

End-to-end pipeline: SQL metadata query, FITS retrieval, preprocessing, and CNN-based classification.

Sky Division

The sky is divided into 36 sectors: right ascension (RA) from 0° to 360° in six 60° bins, and declination (Dec) from +90° to -90° in six 30° bins.
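The binning above can be sketched as a small lookup function. This is a minimal illustration, not the authors' code: the function name `sector_index` is hypothetical, and the row-major, 1-based numbering (Dec rows from +90° downward, RA columns from 0° upward) is an assumption, since the page does not state how sector numbers are assigned.

```python
def sector_index(ra_deg: float, dec_deg: float) -> int:
    """Map (RA, Dec) in degrees to a sector number in 1..36.

    ASSUMPTION: sectors are numbered row-major, starting at 1, with the
    first row covering Dec +90..+60 and the first column RA 0..60.
    The source does not specify its numbering scheme.
    """
    if not -90.0 <= dec_deg <= 90.0:
        raise ValueError("Dec must lie in [-90, 90] degrees")

    ra = ra_deg % 360.0                      # wrap RA into [0, 360)
    ra_bin = min(int(ra // 60.0), 5)         # six 60-degree RA bins: 0..5

    # Dec bins run from +90 down to -90; Dec = -90 is clamped into the last bin.
    dec_bin = min(int((90.0 - dec_deg) // 30.0), 5)

    return dec_bin * 6 + ra_bin + 1


if __name__ == "__main__":
    # Example coordinates (hypothetical, chosen only to exercise the function).
    for ra, dec in [(200.0, 45.0), (200.0, 15.0), (30.0, 45.0), (30.0, 15.0)]:
        print(f"RA={ra:6.1f}  Dec={dec:+5.1f}  ->  sector {sector_index(ra, dec)}")
```

Under a different numbering convention only the final line of the function would change; the bin edges themselves follow directly from the 60° and 30° widths stated above.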

Data Workflow

SQL metadata queries yield FITS URLs; each retrieved image is cropped to 45x45 pixels, converted to PNG, stacked across the 5 filters, normalized, and augmented.
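The crop, stack, normalize, and augment steps can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions, not the published pipeline: the crop is assumed to be centred on the image, normalization is assumed to be per-band min-max scaling, and the augmentations (90° rotations and a horizontal flip) are assumed choices. FITS download and PNG conversion are omitted, and all function names are hypothetical.

```python
import numpy as np


def center_crop(img: np.ndarray, size: int = 45) -> np.ndarray:
    """Crop a 2-D image to size x size around its centre (assumed crop rule)."""
    h, w = img.shape
    if h < size or w < size:
        raise ValueError("image is smaller than the requested crop")
    top, left = (h - size) // 2, (w - size) // 2
    return img[top:top + size, left:left + size]


def preprocess(bands, size: int = 45) -> np.ndarray:
    """Crop each of the 5 filter images and stack them into (size, size, 5)."""
    if len(bands) != 5:
        raise ValueError("expected one image per filter (5 in total)")
    crops = [center_crop(np.asarray(b, dtype=np.float32), size) for b in bands]
    cube = np.stack(crops, axis=-1)
    # ASSUMPTION: per-band min-max scaling to [0, 1]; the source only says "normalize".
    lo = cube.min(axis=(0, 1), keepdims=True)
    hi = cube.max(axis=(0, 1), keepdims=True)
    return (cube - lo) / np.maximum(hi - lo, 1e-8)


def augment(cube: np.ndarray) -> list:
    """Return simple geometric variants (assumed augmentation set)."""
    rotations = [np.rot90(cube, k, axes=(0, 1)) for k in range(4)]
    return rotations + [np.flip(cube, axis=1)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fake_bands = [rng.random((64, 64)) for _ in range(5)]  # stand-in for real cutouts
    cube = preprocess(fake_bands)
    print(cube.shape, len(augment(cube)))
```

Rotations and flips are a common choice for astronomical cutouts because the class of an object does not depend on its orientation on the sky.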

Training Setup

Batch size 32, Adam optimizer, learning rate 0.001, binary cross-entropy for star-vs-galaxy output.
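For reference, the stated hyperparameters and the binary cross-entropy loss can be written out in plain Python. This is a framework-independent sketch: the dictionary keys, the function name, and the clipping constant `eps` are illustrative assumptions, and the label encoding (which class is 1) is not given in the source.

```python
import math

# Hyperparameters as stated above; the key names are illustrative.
TRAIN_CONFIG = {
    "batch_size": 32,
    "optimizer": "adam",
    "learning_rate": 1e-3,
    "loss": "binary_crossentropy",
}


def binary_cross_entropy(y_true, y_prob, eps: float = 1e-7) -> float:
    """Mean binary cross-entropy over a batch.

    y_true: labels in {0, 1} (ASSUMPTION: one class encoded as 1, the other as 0).
    y_prob: predicted probabilities of the class encoded as 1.
    """
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


if __name__ == "__main__":
    # An uninformative prediction of 0.5 gives a loss of ln(2) per sample.
    print(binary_cross_entropy([1, 0], [0.5, 0.5]))
```

A single sigmoid output trained with this loss is the standard formulation for a two-class problem such as star versus galaxy.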

Evaluation Settings

Sector-specific and combined-sector evaluation, plus robustness on unseen sectors via Sector-7 and Sector-13 tests and 4-way zero-shot tests.

Sector Map

Highlighted sectors: 10 and 16 (studied) and 7 and 13 (generalization). Some sectors have no SDSS data.

Results

Key quantitative results across sector-specific, combined, and generalization settings.

Main Metrics (Acc / P / R / F1)

Detailed metrics (Accuracy, Precision, Recall, F1) for Sector-10, Sector-16, and combined data.

Runtime per Epoch

Model	Sector-10	Sector-16	Combined
Proposed	15s	13s	25s
CovNet	80s	80s	180s
MargNet	1000s	570s	1610s

Generalization

Sector-7 accuracy: 0.95 (Proposed) vs 0.93 (MargNet)
Sector-13 accuracy: 0.93 (Proposed) vs 0.78 (MargNet)
Zero-shot average accuracy: 0.89 (Proposed) vs 0.78 (MargNet)
Zero-shot std: 0.04 (Proposed) vs 0.17 (MargNet)

Confusion Matrix Summary

Setting	TP	FP	FN	TN
Sector-10	935	53	44	968
Sector-16	963	25	44	968
Combined	1858	123	67	1952
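The standard metrics can be recomputed from the confusion-matrix counts above. The sketch below uses the usual binary definitions with the positive class taken as given in the table; which class (star or galaxy) counts as positive is not stated on this page, and the function name is hypothetical.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, recall, and F1 from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    if total == 0 or (tp + fp) == 0 or (tp + fn) == 0:
        raise ValueError("counts do not allow all metrics to be computed")
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)  # equivalent to the harmonic mean of P and R
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Counts copied from the table above, in (TP, FP, FN, TN) order.
CONFUSION = {
    "Sector-10": (935, 53, 44, 968),
    "Sector-16": (963, 25, 44, 968),
    "Combined": (1858, 123, 67, 1952),
}

if __name__ == "__main__":
    for name, counts in CONFUSION.items():
        metrics = binary_metrics(*counts)
        print(name, {k: round(v, 4) for k, v in metrics.items()})
```

With these counts the accuracy is 0.9515 for Sector-10 and 0.9525 for the combined set. Values derived this way may differ slightly from a reported metrics table if that table averages over both classes.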

Sector-7 and Sector-13 accuracy: proposed model compared with MargNet.

Limitations and Future Work

Full-sky scaling: Extend training and evaluation from a subset of sectors to the complete 36-sector sky partition.
Domain-shift robustness: Improve stability when models are deployed on distant or previously unseen sectors.
Auxiliary context: Incorporate sector-level information such as sky position and survey metadata as additional inputs.
Model efficiency: Investigate slightly deeper yet computationally efficient architectures without sacrificing runtime performance.

BibTeX

@inproceedings{likhit2024a,
  title={A Novel Sector-Based Algorithm for an Optimized Star-Galaxy Classification},
  author={Anumanchi Agastya Sai Ram Likhit and Divyansh Tripathi and Akshay Agarwal},
  booktitle={The Second Tiny Papers Track at ICLR 2024},
  year={2024},
  url={https://openreview.net/forum?id=HzEefCle2c}
}