I once wrote an add-on for the SQL Server Accelerator for Business Intelligence to provide unknown-member functionality. I was therefore naturally interested when I heard that it comes out-of-the-box with Analysis Services 2005.
The idea with the unknown member is that, for each level of a hierarchy (or for each attribute in Analysis Services 2005), there is an unknown member. Members at a lower level that cannot be resolved (no matching parent), or fact table rows that cannot be resolved, are mapped to the unknown bucket.
Why would we want to do this?
As we often gather data from various sources, it will sometimes not be fully synchronised. Let’s say we have a fact table that is dimensioned by 20 dimension tables. Some fact data happens to come in that is resolved to 19 of the dimensions, but not to one less important dimension. This might be because the dimension gets data from a different source to the fact table and it has not yet been refreshed. When browsing the cube by any of the 19 dims, we cannot tell that another dimension is unresolved. When browsing by the unresolved dim, the data appears under the unknown bucket. Then, 1 or 2 ETL cycles later, the dimension is refreshed; the facts are fully resolved and relate to the correct member.
Let’s say the fact table has a Sales measure. If we do not allow for the unknown member, we will get incorrect sales amounts across all 20 dimensions because not all the sales data is accounted for.
The reasons given by Ralph Kimball for using integer surrogate keys also highlight the value of the unknown member. We cannot always rely on the consistency of database IDs from production. This can result in unmatched fact-table rows for certain dimensions. The measures in the cube can then still take account of these rows until the discrepancies are resolved.
Utilising the unknown member results in a more robust ETL. It can avoid errors that are time consuming to resolve and cause an ongoing management overhead.
How would we implement this?
We would typically have a foreign key relationship in the relational database between a fact table and each dimension table. On the face of it, it would therefore be impossible to insert fact table rows that cannot be resolved to all referenced dimensions.
Whether we follow Kimball’s design methodology on integer surrogate keys affects how we deal with the unknown member. I will cover both cases.
Without Integer Surrogate Keys
In Analysis Services 2005, we define the foreign key constraint logically in the DSV. It is not propagated back to the relational source. (The relational source could optionally have a disabled foreign key constraint representing this relationship.) We can therefore insert fact table rows that are not necessarily referenced by the parent dimension. In this case the unknown member will just ‘work’.
A number of the problems outlined by Kimball regarding not using surrogate keys are overcome by Analysis Services 2005 providing out-of-the-box unknown member functionality. For example, if dimension keys come in at fact level that are not formatted as expected, they will reside under the unknown bucket until the discrepancy has been resolved.
The biggest value-add resulting from the out-of-the-box Analysis Services 2005 unknown-member functionality applies to OLAP systems that do not use integer surrogate keys.
Dim Properties in Analysis Services 2005
The following screenshot shows a couple of dimension properties in Analysis Services 2005 for the unknown member.

With UnknownMember set to Visible, we get the behaviour described above. Hidden means the unknown member is not visible, but the aggregations still take account of it. None means the unknown member is not used.
The UnknownMemberName property just allows us to change the name of the unknown member from the default of ‘Unknown’.
With Integer Surrogate Keys
However, I am in the integer surrogate-key ‘camp’. In this case, we can create a physical unknown member by having a dimension-table row with an ID of -1. Then, as part of the ETL, any unresolved fact-table rows can be mapped to this special dimension row.
Integer surrogate keys inherently provide a structure that lends itself to the unknown member.
With Analysis Services 2005, we don’t actually need to bother with the physical dimension member (dimension row with ID of -1). Since foreign key relationships allow nulls in the child table even if no parent row has a null key (null doesn’t equal null anyway), we can have the foreign-key field in the fact table holding a null value for unresolved rows. The out-of-the-box Analysis Services 2005 functionality will automatically map such rows to the unknown bucket.
Whether we use null in the fact table or a physical dimension member doesn’t really make much difference. I will show an example that uses a physical dimension member, but the similar logic would be applied in either case.

We have 2 tables: the product dimension table (DimProd) and the sales fact table (FactSales). The ID from the transactional system is held in DimProdCode for reference.
In order to allow for the unknown member, we will need to store the product ID from the transactional system in the fact table as well (although we can isolate this to the fact table for the ‘unknown partition’ – see below). During the current ETL cycle, there is a fact with an ID from the transactional system of ‘JKL’. As there is no corresponding product for this row, it has a DimProdID value of -1.
At the next ETL cycle, the product with transactional ID of ‘JKL’ comes in and the dim table is refreshed. The ETL must then update the -1 value with the newly generated value for DimProdID. After cube processing, there would be nothing in the unknown member.
Partitioning for Unknowns
If we are dealing with large volumes and minimising the cube-processing window is important to us, we can take advantage of measure-group partitions. Let’s say we want to incrementally process our cubes in order to save time. Given the dynamic nature of the data in the fact table due to the unknown member, we have to perform full processing unless we partition accordingly. Let’s use the above (very simple) example to illustrate how we would do this.

We are splitting up the fact table into 2 tables; one to hold fully resolved facts and the other to hold facts that are unresolved to at least one dimension. This allows us to incrementally process the main partition for FactSales (which would typically hold the vast majority of the data), and fully process ‘unknown partition’ for FactSales_Unk. This could save a big chunk of cube processing time.
Note that FactSales no longer requires the DimProdCode column because all rows can be looked up against the DimProd table.
ETL-Cycle Process with Unknown Partitioning
Let’s follow through a possible sequence of events for a group of fact table rows. Again, our hypothetical fact table is dimensioned by 20 dimensions.
ETL Cycle 1
Facts come in that are unresolved for 3 of the 20 dimensions. They are therefore placed in the unknown partition.
ETL Cycle 2
The dimension data is refreshed and there is now only a single unresolved dimension for the facts. Some of the -1 values are updated accordingly, but the data remains in the unknown partition.
ETL Cycle 3
The dimension data is again refreshed and all 20 dimensions are now fully resolved. From this point, the data becomes much more static, so it is deleted from the unknown partition and incrementally loaded into the main partition.
Conclusion
I think the unknown member is pretty useful. Setting it up properly for large volumes with incremental processing can result in a little extra development time. This extra setup is obviously not required if the cubes are fully processed.
The out-of-the-box Analysis Services 2005 functionality for the unknown member is primarily geared to OLAP systems that do not use integer surrogate keys.
The unknown member can result in a more robust ETL. Not to mention the fact that it can prevent incorrect numbers being displayed to the users (God forbid!).