Solving Raven’s Progressive Matrices

Raven’s Progressive Matrices

Designed and developed an AI agent in Python that solves 2×1, 2×2, and 3×3 visual IQ test problems using knowledge-based AI techniques. It was implemented as part of the projects in the Knowledge Based AI (CS 7637) class with Prof. Ashok Goel.
The Raven’s Progressive Matrices (RPM) test is a commonly used test of general human intelligence. It is unusual among general intelligence tests in that it focuses on visual problem solving, and in particular on visual similarity and analogy. Four projects were developed individually, each building on the last, to solve propositional, imagistic, and multimodal representations of the RPM with increasing complexity, and to investigate how different representations could contribute to visual problem solving by an AI agent.

Background Research

Possible techniques and algorithms for solving the RPM were analysed. The following methods were shortlisted based on their efficiency and accuracy:

  • Generate & Test Method
  • Learning by Recording Cases
  • Case Based Reasoning
  • Logic

 

 

Technologies Used

Programming Language : Python
I decided to challenge my skills and add Python to my programming repertoire. In the first three projects, the Agent was trained on propositional depictions of the Raven’s problems with increasing relational complexity.

Computer Vision Library : OpenCV
Since the problems given to the agent in the last project were visual instead of propositional, the agent had to be modified accordingly. OpenCV was chosen due to its ease of use and integration with Python. Again, OpenCV was completely new to me, and I spent a lot of time digging through the documentation before regaining the accuracy of the first three projects. The findContours method was used to find both the contours of the image and the hierarchy of the objects found, so that nested shapes could be detected.
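To illustrate how the hierarchy exposes nesting, here is a minimal sketch. It assumes the layout OpenCV documents for findContours in tree retrieval mode (cv2.RETR_TREE): one row per contour of the form [next, previous, first_child, parent], with -1 meaning "none". The sample rows and the helper name nesting_depths are hypothetical and hand-written for illustration, not taken from the project.

```python
def nesting_depths(hierarchy):
    """Return the nesting depth of every contour (0 = outermost).

    `hierarchy` is a list of [next, previous, first_child, parent] rows.
    With OpenCV this would typically be the first element of the hierarchy
    array returned by cv2.findContours (assumption: tree retrieval mode).
    """
    depths = []
    for row in hierarchy:
        depth = 0
        parent = row[3]
        # Follow the parent links upwards until the outermost contour.
        while parent != -1:
            depth += 1
            parent = hierarchy[parent][3]
        depths.append(depth)
    return depths


# Hand-written sample: contour 0 is an outer boundary, contour 1 sits
# inside it, and contour 2 sits inside contour 1.
sample = [
    [-1, -1, 1, -1],
    [-1, -1, 2, 0],
    [-1, -1, -1, 1],
]

print(nesting_depths(sample))  # [0, 1, 2]
```

A depth greater than zero marks a contour that lies inside another one, which is the information needed to recognise nested shapes.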

Challenges Faced

Problems with Shape Duplication

OpenCV was picking up both the inner and outer contours for each “un-filled shape”. This meant that instead of finding one circle in 2×1 Basic Problem 01, it would discover two. This led to incorrect data, since it was inconsistent with the logic used for deleting objects from the figure.

To solve this, the moments of each detected shape were calculated and used to find its centroid (cx, cy) and its area. Theoretically, if the area and the centroid of two detections were the same, then they were the same shape (a duplicate). In reality, the areas enclosed by the outer and inner contours were very similar but not identical, while the centroid remained the same. To compensate, a threshold was added to the area comparison and adjusted in small steps until the following cases could be told apart:

  • The image was a single shape that was being discovered twice.
  • The image contained two or more shapes, one inside another, all of the same shape type (e.g. both circles). In this case, four circles would be discovered, and the agent would need to tell the difference between deleting a duplicate shape and deleting a real shape.


To fix this, the threshold used when comparing areas was adjusted. The optimal setting was found to be a difference of 3000 px in area. This accounted for both small and large shapes, as well as pairs of shapes nested one inside another.
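The duplicate check can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: shapes are given as lists of (x, y) vertices, area and centroid are computed with the shoelace formula (in OpenCV the same quantities would come from cv2.moments, with m00 as the area and m10/m00, m01/m00 as the centroid), and the centroid tolerance of 2 px and all function names are hypothetical. Only the 3000 px area threshold comes from the text above.

```python
import math

AREA_THRESHOLD = 3000.0    # px, the value reported in the text
CENTROID_TOLERANCE = 2.0   # px, assumed for this sketch


def area_and_centroid(points):
    """Area and centroid of a closed polygon via the shoelace formula."""
    signed = cx = cy = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        signed += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    signed *= 0.5
    if signed == 0:
        raise ValueError("degenerate polygon with zero area")
    return abs(signed), (cx / (6 * signed), cy / (6 * signed))


def is_duplicate(shape_a, shape_b):
    """Same centroid and nearly the same area: treat as one shape found twice."""
    area_a, (ax, ay) = area_and_centroid(shape_a)
    area_b, (bx, by) = area_and_centroid(shape_b)
    same_centre = math.hypot(ax - bx, ay - by) <= CENTROID_TOLERANCE
    return same_centre and abs(area_a - area_b) <= AREA_THRESHOLD


def drop_duplicates(shapes):
    """Keep only the first detection of each shape."""
    kept = []
    for shape in shapes:
        if not any(is_duplicate(shape, other) for other in kept):
            kept.append(shape)
    return kept


def square(lo, hi):
    """Hypothetical helper: axis-aligned square from (lo, lo) to (hi, hi)."""
    return [(lo, lo), (hi, lo), (hi, hi), (lo, hi)]


# Outer and inner boundary of one outlined square, plus a real nested square.
detections = [square(100, 300), square(103, 297), square(150, 250)]
print(len(drop_duplicates(detections)))  # 2
```

In this example the inner boundary differs from the outer one by 2364 px in area and is dropped, while the nested square differs by 30000 px and is kept as a real shape, even though all three share the same centroid.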

Problems with Shape Rotation

While the Agent could correctly detect a change in the rotation of the image, the direction of rotation would confuse it. Take 2×1 Basic Problem 17.

[Figure: 2×1 Basic Problem 17]
A human (looking at the options) would know that two different relationships could be drawn from the image, depending on whether it was rotated clockwise or anti-clockwise. Thus, the rotation could be read as either 90° or 270°, depending on the direction in which it was measured.
As a workaround, once the Agent recognized that a rotation had been applied, it tested the solution by applying the rotation in both directions and comparing the results against the answer options.
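A minimal sketch of that "try both directions" idea is shown below. It assumes figures are small 2D grids of 0/1 values rather than real images, and the function names and sample data are hypothetical, chosen only to show the control flow.

```python
def rotate_cw(grid):
    """Rotate a 2D grid by 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def rotate_ccw(grid):
    """Rotate a 2D grid by 90 degrees anti-clockwise."""
    return [list(row) for row in zip(*grid)][::-1]


def matching_options(figure, options):
    """Apply the rotation in both directions and report every option hit.

    Returns a list of (direction, option_number) pairs, numbered from 1.
    """
    candidates = {
        "clockwise": rotate_cw(figure),
        "anti-clockwise": rotate_ccw(figure),
    }
    matches = []
    for direction, rotated in candidates.items():
        for number, option in enumerate(options, start=1):
            if rotated == option:
                matches.append((direction, number))
    return matches


# Hypothetical figure C with a single mark in the top-left corner.
figure_c = [[1, 0],
            [0, 0]]

# Hypothetical answer options; only one of them is a 90 degree rotation of C.
options = [
    [[0, 0], [0, 1]],   # mark bottom-right (180 degree rotation)
    [[0, 0], [1, 0]],   # mark bottom-left (anti-clockwise rotation)
    [[1, 0], [0, 0]],   # unchanged
]

print(matching_options(figure_c, options))  # [('anti-clockwise', 2)]
```

Because both directions are generated and checked, the agent does not need to decide the direction up front; whichever rotated figure appears among the options determines the answer.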

Problems with Shape Identification

Smaller images still cause errors by occasionally being recognized as different shapes. For instance, a small octagon was mistakenly detected as a circle in 25% of cases.

Future implementations could include a check of the moments of the angles to test if the shape was a polygon or a circle.
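One possible form of such an angle-based check is sketched below. This is not something the project implemented, and it swaps in a simple stand-in technique: counting the vertices at which the outline turns sharply. It assumes the outline has already been reduced to an ordered list of (x, y) points (with real, noisy pixel contours this would likely need a simplification step first, for example OpenCV's approxPolyDP). The 20° turn threshold and all names are hypothetical.

```python
import math


def count_corners(points, min_turn_deg=20.0):
    """Count vertices where the outline turns by more than min_turn_deg."""
    n = len(points)
    corners = 0
    for i in range(n):
        x0, y0 = points[i - 1]
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        heading_in = math.atan2(y1 - y0, x1 - x0)
        heading_out = math.atan2(y2 - y1, x2 - x1)
        turn = math.degrees(heading_out - heading_in)
        turn = (turn + 180.0) % 360.0 - 180.0  # wrap into [-180, 180)
        if abs(turn) > min_turn_deg:
            corners += 1
    return corners


def classify(points):
    """Label an outline as a polygon if it has at least three sharp corners."""
    return "polygon" if count_corners(points) >= 3 else "circle"


def regular_polygon(n, radius=50.0):
    """Hypothetical helper: vertices of a regular n-gon around the origin."""
    return [
        (radius * math.cos(2 * math.pi * k / n),
         radius * math.sin(2 * math.pi * k / n))
        for k in range(n)
    ]


octagon = regular_polygon(8)          # turns by 45 degrees at each vertex
circle_like = regular_polygon(72)     # turns by only 5 degrees per point

print(classify(octagon))      # polygon
print(classify(circle_like))  # circle
```

The design choice here is that a circle spreads its 360° of turning evenly over many small steps, while a polygon concentrates it in a few sharp corners, so the two can be separated even when their areas and outlines look similar.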
