I'm working with our compliance department to model our complaint data. I'm using some automation tools to help build this out, but automation tools are not a substitute for solid modeling techniques. As I've profiled the data and worked with the business users in the department, I've been told they identify complaints by the following columns: ReceivedDate, ApplicationID, AccountNumber, BorrowerName, and LendingPartnerName. The following columns change over time for various reasons (e.g. a complaint has been resolved and closed): ClosedDate, ComplaintTierId, Description, Resolution, Notes, ComplaintStatusId, ComplaintStatusDate. There are also a few tables used to describe certain events; I've modeled these as reference tables.
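For context, here's roughly how I'm deriving the hub hash key from those five business key columns. This is just a sketch: the delimiter, casing/trimming rules, and the choice of MD5 are my assumptions, not anything prescribed.

```python
import hashlib

# The five columns the business users say identify a complaint.
BUSINESS_KEY_COLUMNS = [
    "ReceivedDate", "ApplicationID", "AccountNumber",
    "BorrowerName", "LendingPartnerName",
]

def hub_hash_key(row: dict) -> str:
    """Trim and upper-case each business key column, join with a
    delimiter, and hash the result into a deterministic hub key."""
    parts = [str(row.get(col, "")).strip().upper() for col in BUSINESS_KEY_COLUMNS]
    return hashlib.md5("||".join(parts).encode("utf-8")).hexdigest()

# Hypothetical example row, for illustration only.
complaint = {
    "ReceivedDate": "2024-01-15",
    "ApplicationID": "A-1001",
    "AccountNumber": "555123",
    "BorrowerName": "Jane Doe",
    "LendingPartnerName": "Acme Lending",
}
```

The hard standardization (trim/upper-case, fixed delimiter) is what makes the key repeatable across loads, which matters for the duplicate question below.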
Here is the issue I face. When I hash the business keys I see duplicate values. In some cases these are true duplicates, and those I can handle in de-dup. However, as the change columns change over time, new records are inserted into the table, creating a "duplicate" record. I say duplicate, but these new records are not true duplicates, as the complaint status may have been updated or the tier may have been changed.
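To illustrate how I'm thinking about separating true duplicates from legitimate state changes: hash the change columns into a hashdiff and dedupe on the pair (business key hash, hashdiff). A true duplicate matches on both; a status or tier change matches on the business key hash only and survives as a new row. The `ComplaintHashKey` column name and the MD5/delimiter choices here are my assumptions.

```python
import hashlib

# The change columns from my post, hashed into a "hashdiff".
CHANGE_COLUMNS = [
    "ClosedDate", "ComplaintTierId", "Description", "Resolution",
    "Notes", "ComplaintStatusId", "ComplaintStatusDate",
]

def hashdiff(row: dict) -> str:
    """Hash the descriptive (change) columns so two rows compare
    on a single value instead of column by column."""
    parts = [str(row.get(col, "")).strip() for col in CHANGE_COLUMNS]
    return hashlib.md5("||".join(parts).encode("utf-8")).hexdigest()

def dedupe(rows: list, bk_hash_col: str = "ComplaintHashKey") -> list:
    """Keep one row per (business key hash, hashdiff) pair: exact
    duplicates collapse, genuine state changes are retained."""
    seen = set()
    kept = []
    for row in rows:
        key = (row[bk_hash_col], hashdiff(row))
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept
```

So a row whose status changed gets a new hashdiff and is kept as a new satellite record, while a reloaded identical row is dropped.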
Currently, I have this modeled as one hub, four reference tables, and two satellites (one low-volatility and one high-volatility, the latter tracking the change columns noted above). Given the information above, is this the correct path, or should I consider a different approach? Happy to provide more info or diagrams if needed.