In short, it is a trial-and-error process, but there are a few shortcuts that can get you to results faster.
- How about a CNN that labels each pixel as "worn metal", "normal metal", "mud" and so on (i.e. semantic segmentation)? Deep networks can learn lighting-invariant and texture-invariant features better than manual filters/thresholds. But to train it that way, you need to acquire a labeled dataset of images, each with pixel-wise annotations (e.g. polygons or masks) of worn vs. non-worn regions. A rough sketch of what that training loop looks like is included after this list.
- Another option is YOLO or Faster R-CNN; if you have not heard of them, they might be very useful for you. This can be an interesting pick if you have less data or if you are annotating the pictures manually, since drawing bounding boxes is much faster than drawing pixel-level masks. A second sketch after this list shows that workflow.
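Here is a minimal sketch of the per-pixel labeling idea, assuming you already have image tensors and matching per-pixel class masks. The class names, class count, and the FCN backbone are my own placeholder choices, not something from your setup; swap in whatever segmentation network and labels you actually use.

```python
# Minimal semantic-segmentation training sketch (placeholder classes/data).
import torch
import torch.nn as nn
from torchvision.models.segmentation import fcn_resnet50

NUM_CLASSES = 3  # e.g. 0 = normal metal, 1 = worn metal, 2 = mud (assumed labels)
model = fcn_resnet50(weights=None, num_classes=NUM_CLASSES)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(images, masks):
    """images: (N, 3, H, W) float tensor, masks: (N, H, W) long tensor of class ids."""
    model.train()
    optimizer.zero_grad()
    logits = model(images)["out"]      # (N, NUM_CLASSES, H, W) per-pixel scores
    loss = criterion(logits, masks)
    loss.backward()
    optimizer.step()
    return loss.item()

# After training, get a per-pixel class map for new images:
# pred = model(images)["out"].argmax(dim=1)   # (N, H, W) class-id map
```

The output is a class map the same size as the image, so "percentage of worn pixels" or similar metrics fall out directly from counting predicted labels.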
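And here is a rough fine-tuning sketch for the Faster R-CNN route, assuming you annotate bounding boxes around worn regions instead of full masks. The class count and the data format are placeholders meant only to show the shape of the workflow, following the standard torchvision fine-tuning pattern.

```python
# Rough Faster R-CNN fine-tuning sketch (placeholder classes/data).
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

NUM_CLASSES = 2  # background + "worn region" (assumed labels)
model = fasterrcnn_resnet50_fpn(weights="DEFAULT")  # start from COCO-pretrained weights
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, NUM_CLASSES)

optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9)

def train_step(images, targets):
    """images: list of (3, H, W) tensors; targets: list of dicts with
    'boxes' (K, 4) and 'labels' (K,) tensors per image."""
    model.train()
    loss_dict = model(images, targets)   # in train mode, returns a dict of losses
    loss = sum(loss_dict.values())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Starting from pretrained weights is what makes this attractive with a small dataset: only the new box-predictor head has to be learned mostly from scratch.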
Given the challenges you are facing, let a black box handle the decision and help it by providing properly annotated data. I would shift to a CNN-based approach.
Karan
