
# AI Paper Digest

  • Super-brief summaries of AI papers I've read
  • May not perfectly align with the authors' claims or intentions
  • Some papers I consider important include detailed summary links, which lead to my blog posts

**AnomalyCLIP: Object-agnostic Prompt Learning for Zero-shot Anomaly Detection**  
*ICLR 2024*, arxiv, review, code, summary  
Task: zero-shot anomaly detection (ZSAD)  
Previous works use CLIP with object-aware text prompts.  
Even though the foreground object semantics can be completely different, anomaly patterns remain quite similar.  
Thus, use CLIP with learnable object-agnostic text prompts.
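The idea above can be sketched in plain PyTorch: learn two generic context embeddings, one for "normal" and one for "abnormal", shared across all object categories instead of per-class prompts. All names, dimensions, and the toy text encoder below are illustrative assumptions, not AnomalyCLIP's actual implementation.

```python
import torch
import torch.nn as nn

class ObjectAgnosticPrompt(nn.Module):
    """Minimal sketch of object-agnostic prompt learning.

    Instead of object-aware prompts like "a photo of a damaged [class]",
    two generic learnable contexts ("object" vs. "damaged object") score
    anomalies for any object category with the same text embeddings.
    """
    def __init__(self, n_ctx=12, dim=512):
        super().__init__()
        # one learnable context for "normal", one for "abnormal"
        self.normal_ctx = nn.Parameter(torch.randn(n_ctx, dim) * 0.02)
        self.abnormal_ctx = nn.Parameter(torch.randn(n_ctx, dim) * 0.02)

    def anomaly_score(self, image_feat, text_encoder):
        # text_encoder maps a (n_ctx, dim) context to a (dim,) embedding;
        # a frozen CLIP text encoder would play this role in practice
        t_normal = text_encoder(self.normal_ctx)
        t_abnormal = text_encoder(self.abnormal_ctx)
        logits = image_feat @ torch.stack([t_normal, t_abnormal]).T
        return logits.softmax(-1)[..., 1]  # probability of "abnormal"
```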

**Tiny and Efficient Model for the Edge Detection Generalization**  
*ICCV 2023 Workshop (Resource Efficient Deep Learning for Computer Vision)*, arxiv, code  
Task: edge detection  
Propose a simple, efficient, and robust CNN model: Tiny and Efficient Edge Detector (TEED).  
TEED generates thinner and clearer edge maps, but requires a paired dataset for training.  
Two core methods: architecture (edge fusion module) & loss (weighted cross-entropy, tracing loss).  
Weighted cross-entropy helps detect as many edges as possible, while tracing loss helps predict thinner and clearer edge maps.
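A minimal sketch of the class-balancing idea behind the weighted cross-entropy (an illustrative assumption, not TEED's exact formulation): edge pixels are rare, so each class is weighted by the frequency of the opposite class, which pushes the model toward keeping edges.

```python
import torch
import torch.nn.functional as F

def weighted_edge_bce(pred_logits, target):
    """Class-balanced binary cross-entropy for edge maps (sketch).

    target: 0/1 edge map. Edge pixels get weight proportional to the
    non-edge fraction and vice versa, so the loss does not collapse
    to predicting "no edge" everywhere.
    """
    pos = target.sum()                       # number of edge pixels
    neg = target.numel() - pos               # number of non-edge pixels
    weight = torch.where(target > 0.5,
                         neg / target.numel(),
                         pos / target.numel())
    return F.binary_cross_entropy_with_logits(pred_logits, target,
                                              weight=weight)
```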

**Learning to generate line drawings that convey geometry and semantics**  
*CVPR 2022*, arxiv, code, website  
Task: automatic line drawing generation  
View line drawing generation as an unsupervised image translation problem, which means training models with unpaired data.  
Most previous works solely consider preserving photographic appearance through cycle consistency.  
Instead, use 4 losses to improve quality: adversarial loss (LSGAN), geometry loss (pseudo depth map), semantic loss (CLIP), appearance loss (cycle consistency).
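Of the four terms, the semantic loss can be sketched as a cosine distance between frozen-CLIP embeddings of the photo and the generated drawing, so the translation preserves semantics rather than just appearance. The function below is an illustrative assumption, not the paper's exact formulation; extracting the features with a frozen CLIP image encoder is assumed to happen upstream.

```python
import torch
import torch.nn.functional as F

def clip_semantic_loss(photo_feat, drawing_feat):
    """Semantic term (sketch): penalize CLIP-embedding mismatch between
    a photo and its line drawing. Inputs are (batch, dim) features from
    a frozen CLIP image encoder; loss is 0 when embeddings align."""
    return 1.0 - F.cosine_similarity(photo_feat, drawing_feat, dim=-1).mean()
```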

**Generalisation in humans and deep neural networks**  
*NeurIPS 2018*, arxiv, review, code, summary  
Task: understanding the differences between DNNs and humans  
Compare the robustness of humans and DNNs (VGG, GoogLeNet, ResNet) on object recognition under 12 different image distortions.  
The human visual system is more robust than DNNs.  
DNNs generalize poorly under non-i.i.d. settings.
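The comparison above boils down to a loop over distortion types: apply each distortion to the test images and record top-1 accuracy per distortion. A minimal sketch, where `model` and the distortion functions are placeholders:

```python
def accuracy_under_distortions(model, images, labels, distortions):
    """Evaluation protocol (sketch): one accuracy number per distortion.

    model:       callable image -> predicted label
    distortions: dict mapping distortion name -> callable image -> image
    """
    results = {}
    for name, distort in distortions.items():
        correct = sum(model(distort(x)) == y
                      for x, y in zip(images, labels))
        results[name] = correct / len(images)
    return results
```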


format

**paper title**  
*accept info*, [arxiv](), [review](), [code](), [website](), [summary]()  
super-brief summary