In general, I support storing actual dates rather than ages, but this is one of the exceptions. In our data processing, we go by age rather than date. Federal TEDS (Treatment Episode Data Set) requirements use age of first use for substance abuse tracking, because exact dates can be very difficult to pin down. Some subset of your population might know exact dates, but that percentage is probably low enough that "date of first use" is too noisy a signal to do anything reasonable with. Even years may be hazy for some people, but somebody is more likely to remember how old they were when they started drinking than the date on which they started.
The other reason an age is OK from a data modeling perspective is that it doesn't change. Unlike a "3 years ago" type of field, "age of first use" is static: a person who began drinking at 18 will always have begun drinking at 18. A field that stores "how many years ago did you begin?" would be a problem, because the answer goes stale as time passes unless you also record when it was asked. It doesn't sound like you're doing that, though, so you're safe.
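
To make the static-versus-relative distinction concrete, here is a minimal Python sketch. Everything in it is hypothetical: the `Client` record, the field names, and the assumption that you also store a date of birth are mine, not something from your schema or from TEDS. The idea is to store the age of first use once and derive any "years ago" figure at query time instead of persisting it.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Client:
    # Hypothetical record; assumes date of birth is available.
    date_of_birth: date
    age_at_first_use: Optional[int]  # static fact; None when unknown


def age_on(date_of_birth: date, as_of: date) -> int:
    """Age in whole years on the given date."""
    had_birthday = (as_of.month, as_of.day) >= (date_of_birth.month, date_of_birth.day)
    return as_of.year - date_of_birth.year - (0 if had_birthday else 1)


def years_since_first_use(client: Client, as_of: date) -> Optional[int]:
    """Derived at query time, never stored, so it cannot go stale."""
    if client.age_at_first_use is None:
        return None
    return age_on(client.date_of_birth, as_of) - client.age_at_first_use


client = Client(date_of_birth=date(1980, 6, 15), age_at_first_use=18)
print(years_since_first_use(client, date(2020, 6, 14)))  # day before the 40th birthday
print(years_since_first_use(client, date(2020, 6, 15)))  # on the 40th birthday
```

The derived number is only accurate to within about a year, since the stored age carries no month or day, but that matches the precision people can actually report.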