dml wrote:
- rework code to minimize the duration of time spent using CPU+DSP at the same time - make those blocks of time as narrow as possible

AdamK wrote:
Why? Isn't processing in parallel faster?

Good question. Yes, it is, but there are two levels of parallelism involved: fine-grain and coarse-grain. I'm looking for opportunities to use both kinds, and am currently focusing on the coarse-grain type. Fine-grain is already being used in most places where it seems to fit.
1) Fine-grain parallelism is where the two chips communicate/synchronize continuously on the same single task, and neither is free to be used for a different task until that task is done. The BSP algorithm works like this. This method works OK if both chips have a roughly equal share of the work and neither side makes the other wait a lot. It is inefficient if a lot of exchanges are required between the two sides and/or if one side carries most of the load.
In practice it usually involves the DSP stopping, starting and idling frequently in many small bursts, and it's usually pretty complicated to optimize. Most of the code works this way where it can, but not everything can use it sensibly, e.g. scan-converting polygons doesn't benefit from the CPU at all, and reindexing big geometry in main memory can't use the DSP.
So there are hybrid cases which can use both chips on the one task, and other cases which are CPU-only or DSP-only tasks.
2) Coarse-grain parallelism is where you take the remaining CPU-only and DSP-only tasks and try to run them at the same time, overlapping them as much as possible, without breaking anything.
What I meant by making the CPU+DSP cases as narrow as possible is lifting out of those cases any code which is CPU-only or DSP-only, and taking advantage of coarse-grain parallelism for those bits of code by executing them as separate routines. It doesn't compromise fine-grain parallelism at all, since it only affects pieces of code which aren't already working in a parallel way.
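To make the effect of narrowing the CPU+DSP blocks concrete, here is a rough Python sketch. It is not the engine code: the Task type, the task names and every timing are made up for illustration, and it assumes the single-chip tasks have no data dependencies beyond their order on each chip. A task tagged BOTH holds both chips for its whole duration, while CPU-only and DSP-only tasks are allowed to overlap.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    resource: str  # "CPU", "DSP" or "BOTH" (hypothetical tags)
    ms: float      # made-up duration in milliseconds


def serial_time(tasks):
    """Frame time if every task runs one after another, with no overlap."""
    return sum(t.ms for t in tasks)


def overlapped_time(tasks):
    """Frame time if single-chip tasks overlap and BOTH tasks block both chips."""
    cpu_free = 0.0  # time at which the CPU next becomes idle
    dsp_free = 0.0  # time at which the DSP next becomes idle
    for t in tasks:
        if t.resource == "CPU":
            cpu_free += t.ms
        elif t.resource == "DSP":
            dsp_free += t.ms
        else:
            # A hybrid task must wait for both chips, then occupies both.
            start = max(cpu_free, dsp_free)
            cpu_free = dsp_free = start + t.ms
    return max(cpu_free, dsp_free)


# Before: 2 ms of CPU-only work is buried inside the hybrid block.
before = [
    Task("hybrid stage (includes CPU-only setup)", "BOTH", 6.0),
    Task("scan conversion", "DSP", 5.0),
    Task("geometry reindexing", "CPU", 3.0),
]

# After: the CPU-only part is lifted out and run as a separate routine.
after = [
    Task("hybrid stage (core only)", "BOTH", 4.0),
    Task("lifted CPU-only setup", "CPU", 2.0),
    Task("scan conversion", "DSP", 5.0),
    Task("geometry reindexing", "CPU", 3.0),
]

print(serial_time(before))      # 14.0
print(overlapped_time(before))  # 11.0
print(overlapped_time(after))   # 9.0
```

With these invented numbers the total amount of work is unchanged at 14 ms, but the narrower hybrid block lets the lifted CPU work run alongside the DSP-only scan conversion, so the modelled frame drops from 11 ms to 9 ms.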
I built an Excel sheet showing the approximate time spent in each task and which resources each one uses, to help with overlapping the parallel work. I haven't had time to complete it yet, but I will post it later.
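The kind of bookkeeping such a sheet supports can be sketched in a few lines of Python. The rows below are placeholders, not figures from the actual sheet: the task names are borrowed from the examples above and the timings are invented. The sketch totals the busy time per chip and gives a lower bound on frame time, which is the load of the busier chip.

```python
# Hypothetical rows: (task name, chips used, approximate ms). Timings are invented.
ROWS = [
    ("BSP traversal", {"CPU", "DSP"}, 4.0),
    ("scan conversion", {"DSP"}, 5.0),
    ("geometry reindexing", {"CPU"}, 3.0),
]


def chip_load(rows):
    """Total busy time per chip; a hybrid task counts against both chips."""
    load = {"CPU": 0.0, "DSP": 0.0}
    for _name, chips, ms in rows:
        for chip in chips:
            load[chip] += ms
    return load


def frame_lower_bound(rows):
    """No schedule can finish sooner than the busier chip's total load."""
    return max(chip_load(rows).values())


load = chip_load(ROWS)
bound = frame_lower_bound(ROWS)
for chip in ("CPU", "DSP"):
    # Slack is how long this chip must sit idle even with perfect overlap.
    print(chip, "busy", load[chip], "ms, slack", bound - load[chip], "ms")
```

Comparing the bound with the measured frame time shows how much is still being lost to waiting, and the slack column shows which chip has room to take on more work.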
