1.) You can generate an interrupt every 32 clock ticks: 32 / 32768 = 0.0009765625 seconds per interrupt.
This is close to 1 ms; for exact timing over a long period you can use the delta scheme.
On each interrupt, add 9,765,625 to a long named myTime; whenever myTime is greater than or equal to 10,000,000, subtract 10,000,000 from myTime and increment myMs.
The maximum error at any given time is 1 ms, but the error does not accumulate.
You can also divide both large numbers by 5^7 (giving 125 and 128), so you can work with bytes instead of longs.
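A minimal sketch of the byte-sized delta scheme, assuming the handler is called once every 32 ACLK ticks. The names ms_acc, my_ms, and timer_tick_isr are hypothetical, and the hookup to the actual timer interrupt vector is left out because it depends on the device and compiler.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. */
static uint8_t  ms_acc = 0;   /* fractional part, in units of 1/128 ms */
static uint32_t my_ms  = 0;   /* elapsed whole milliseconds */

/* Call once per interrupt, i.e. every 32 ACLK ticks (125/128 ms). */
void timer_tick_isr(void)
{
    ms_acc += 125;            /* worst case 127 + 125 = 252, still fits in a byte */
    if (ms_acc >= 128) {      /* one full millisecond has accumulated */
        ms_acc -= 128;
        my_ms++;
    }
}
```

After 1024 calls (exactly one second of ACLK time) my_ms is 1000 and ms_acc is back at 0, so the remainder never drifts.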
I agree with what Jan said and would like to add the following to approach (2).
I think you should use an MSP430F2001 (or any other MSP430F20xx) and a good 32.768 kHz crystal. Set the DCO to 8192 kHz and periodically adjust it, using the 32.768 kHz ACLK as a reference.
The F2xx has a much better DCO than the F1xx or F4xx and can operate at up to 16 MHz.
The F2001 is cheap (<$1) and available in both a 14-pin DIP for prototyping and a small (3 mm x 3 mm) pin-less package for volume production. 1 KB of Flash, 128 B of RAM and 10 GPIO pins should be enough for your application.
SLAA336a (“Using the DCO Library”) is nice, but don’t forget to set up an ACLK interrupt to suppress the DCO error over long periods.
You have to calibrate more than once, because the DCO frequency changes significantly when the temperature changes (touching the chip with a finger, or switching an LED near the CPU, can be enough).
The DCO of F2xx is much better than that of F1xx. It has much smaller temperature and voltage dependency.
However, when you vary the voltage from 1.8 V to 3.6 V and the temperature from -40 to +105 degrees C, the DCO of the F2xx still varies by typically ±2% (min -5%, max +5%). Thus, to get better accuracy, you still need to recalibrate the DCO periodically.
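The periodic recalibration discussed above could look roughly like the step function below. This is only a sketch under assumptions, not the code from the TI DCO library: it assumes you have already measured how many DCO cycles fit into one ACLK period (for example with a timer capture), and it treats the DCO setting as an abstract integer "tap" where a higher value means a higher frequency. The names dco_adjust_step, DCO_TARGET_COUNT, tap, and max_tap are hypothetical; mapping the tap to the real clock control registers is left out.

```c
#include <stdint.h>

/* 8192 kHz / 32.768 kHz = 250 DCO cycles per ACLK period. */
#define DCO_TARGET_COUNT 250u

/* One calibration step (hypothetical helper).
 * measured: DCO cycles counted during one ACLK period
 * tap:      current abstract DCO setting, 0..max_tap (assumed monotonic)
 * Returns the setting to use next. */
uint16_t dco_adjust_step(uint16_t measured, uint16_t tap, uint16_t max_tap)
{
    if (measured > DCO_TARGET_COUNT && tap > 0) {
        return (uint16_t)(tap - 1);   /* DCO too fast: step down */
    }
    if (measured < DCO_TARGET_COUNT && tap < max_tap) {
        return (uint16_t)(tap + 1);   /* DCO too slow: step up */
    }
    return tap;                       /* on target, or at a limit */
}
```

Calling this from a periodic ACLK-driven interrupt keeps the DCO tracking the crystal as temperature and voltage change. Counting over a single ACLK period only resolves 1/250 (0.4%), so counting over several periods gives a finer adjustment.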