Computer Vision

For now, see machine_learning and self_driving_cars

Conferences

CVPR 2023
- Recent Advances in Vision Foundation Models

From Images

You Only Look Once: Unified, Real-Time Object Detection, J. Redmon et al (2016), http://pjreddie.com/yolo/, Y. Liao et al (2021)
Latent Space: Segment Anything Model and the Hard Problems of Computer Vision — with Joseph Nelson of Roboflow (2023)
- Check out the roboflow.com video demo and hands-on demo. SAM makes image annotation as simple as a point and click.
A. Takmaz et al: OpenMask3D: Open-Vocabulary 3D Instance Segmentation (2023)

From Videos

Vid2Avatar: 3D Avatar Reconstruction from Videos in the Wild via Self-supervised Scene Decomposition (2023)

From Point Clouds

PapersWithCode: PointPillars
- PointPillars: Fast Encoders for Object Detection from Point Clouds, Alex H. Lang et al (2019)
- Optimisation of the PointPillars network for 3D object detection in point clouds, J. Stanisz et al (2020)
Y. Guo et al: Deep Learning for 3D Point Clouds: A Survey (2020)
Jinyu Li et al: PillarNeXt: Rethinking Network Designs for 3D Object Detection in LiDAR Point Clouds (2023), github

Annotation

Scale.ai
AWS Ground Truth
Labelbox (Databricks)
CVAT (open source), supports point clouds
Latte (open source), supports point clouds
Roboflow
Charles Qi et al, Waymo: Offboard 3D Object Detection from Point Cloud Sequences (2021), youtube
Zhaoqi Leng et al, Waymo: LidarAugment: Searching for Scalable 3D LiDAR Data Augmentations (2022)

Curation

Scale.ai
Voxel51 (open source, modular design)
- E. Hofesman: The ML Menu for Model Selection: Hugging Face, Weights & Biases, and FiftyOne (2023)

Companies

Common Sense Machines: CSM.ai, Video to 3D, 3D World Generation, NeRF… Located in Cambridge, MA.

Other