Custom CPU Design

Started by Rick C, April 16, 2020
On Tuesday, April 21, 2020 at 12:09:56 PM UTC-4, David Brown wrote:
> On 21/04/2020 17:26, Rick C wrote: > > On Tuesday, April 21, 2020 at 10:20:41 AM UTC-4, Theo wrote: > >> David Brown <david.brown@hesbynett.no> wrote: > >>> On 17/04/2020 17:34, Theo wrote: > >>>> I think part of the problem is the ARM licensing cost - if the > >>>> license cost is (random number) 5% of the silicon sticker price > >>>> that's fine when it's a $1 MCU, but when it's a $10000 FPGA > >>>> that hurts. > >>> > >>> I'm not sure that's valid. First, do you know that the ARM > >>> licensing costs work that way? > >> > >> I have no insight into the licensing contracts (which are likely > >> very confidential), but what I understand is that all Stratix 10 > >> parts have an ARM but relatively few have it enabled. Additionally > >> I understand the licence cost is only paid for parts where it is > >> enabled. From that I surmise that the licence cost is significant; > >> if the cost was minimal then why have a separate SKU without the > >> ARM? > > > > They do the same thing with the FPGA itself. It is not inexpensive > > to spin the masks for FPGAs at the bleeding edge of semiconductor > > fabrication technology. So they sell parts with more or less of the > > part enabled or even just tested (testing cost in an FPGA is not > > inexpensive). So you buy an FPGA with 50,000 LUTs or you buy one > > with 25,000 LUTs and it's the same part. The 50,000 part has the > > entire chip tested, the 25,000 LUT part only tests the section with > > 25,000 LUTs you will be using. They will get the price even lower if > > you are buying a large quantity and you give them your design, so > > they only test the parts of the chip your design uses! > > > > So don't test the CPU and don't pay the license fee. Save some on > > the license and save more on not testing the CPU and various > > supporting logic. 
> > > > > >> One other possibility is that a separate SKU allows the ARM to be > >> faulty and the part still saleable, but it seems that ballpark > >> 80-90% of the eval boards I see are offering parts without ARMs. > >> Which suggests there's a strong motivation not to use it. > > > > I'm told if a chip fails a test, it is tossed. The savings comes > > from not testing a section to begin with. Testing equipment is not > > cheap and FPGAs take a lot of time on the beast. > > That would make it different from many other large, complex parts where > disabling failed sections and even having redundant parts in the design > increase overall yields and lower costs. But I guess it depends on a > balance between yields, types of failure, and testing costs.
If you think about it a bit you will see the only real way to have "redundancy" in FPGAs is to excise entire sections of the chip for a single failure. So a 50 kLUT chip will become a 25 kLUT chip if it has a failure(s) in one half. That's all I've heard of. Trying to replace a small section of a chip to retain the full functionality would result in uneven delays and that's a real problem in FPGAs. -- Rick C. ---++ Get 1,000 miles of free Supercharging ---++ Tesla referral code - //ts.la/richard11209
On 21/04/20 20:42, Rick C wrote:
> Yep. I don't restrict myself to any given techniques. I don't find the XMOS > good for much other than a small set of apps that can utilize the independent > cores and at the same time need the complexity of larger bodies of software. > Anything that doesn't need the multiprocessing can be done on an MCU and > anything that doesn't need a larger code base can be done in an FPGA, likely > for less money in both cases.
It is noteworthy that you are able to confidently state that, even though you can't be bothered to read a four page article summarising the XMOS ecosystem's unique features and benefits. If you don't read, you can't understand, and statements become highly dubious.
On 21/04/20 22:01, Rick C wrote:
> On Tuesday, April 21, 2020 at 11:58:43 AM UTC-4, David Brown wrote: >> The simplest way to get a deadlock is to have two shared resources, and >> two processes (hardware modules, software tasks, whatever) that need >> both the resources, but acquire them in different orders. But you don't >> usually get such simple cases, as they are so obvious. > > That was my point, I've never designed hardware that had "resources" to allocate. Maybe my designs are just too simple. I do like to keep things simple when I design. I've never not been able to do that when designing hardware. > > I'm familiar with deadlock from software design, just don't see it in hardware so far.
Here's a real-life example which has subtle problems. Consider a token ring network in the presence of failures which split and reconnect the ring. Your task is to have exactly one token circulating, not zero, not two. Prove your solution meets that criterion.
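For readers who want to see the simpler two-resource case from the quoted text in action, here is a minimal Python sketch. It is an illustration only, not anything posted in the thread: the names `run_pair`, `r1` and `r2` are made up, and a timed-out acquire stands in for what would be a permanent hang in a real system.

```python
import threading


def run_pair(order_a, order_b, timeout=0.2):
    """Run two workers that each need both resources, taken in the given order.

    Returns {worker_name: True if it got both resources, else False}.
    A timed-out acquire stands in for what would be a permanent hang.
    """
    locks = {"r1": threading.Lock(), "r2": threading.Lock()}
    conflicting = order_a != order_b
    barrier = threading.Barrier(2)
    results = {}

    def worker(name, order):
        first, second = locks[order[0]], locks[order[1]]
        if conflicting:
            with first:
                barrier.wait()  # both workers now hold their first resource
                got = second.acquire(timeout=timeout)
                barrier.wait()  # hold on until both attempts have finished
                if got:
                    second.release()
        else:
            with first, second:  # same order everywhere: no cycle possible
                got = True
        results[name] = got

    threads = [
        threading.Thread(target=worker, args=("A", order_a)),
        threading.Thread(target=worker, args=("B", order_b)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_pair(("r1", "r2"), ("r2", "r1")))  # opposite orders
    print(run_pair(("r1", "r2"), ("r1", "r2")))  # one agreed order
```

With opposite acquisition orders neither worker gets its second resource; with one agreed order both complete. The barriers are only there to make the bad interleaving happen every time instead of occasionally.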
On Tuesday, April 21, 2020 at 9:20:41 AM UTC-5, Theo wrote:
> David Brown <david.brown@hesbynett.no> wrote: > > On 17/04/2020 17:34, Theo wrote: > > > I think part of the problem is the ARM licensing cost - if the license cost > > > is (random number) 5% of the silicon sticker price that's fine when it's a > > > $1 MCU, but when it's a $10000 FPGA that hurts. > > > > I'm not sure that's valid. First, do you know that the ARM licensing > > costs work that way? > > I have no insight into the licensing contracts (which are likely very > confidential), but what I understand is that all Stratix 10 parts have an > ARM but relatively few have it enabled. Additionally I understand the > licence cost is only paid for parts where it is enabled. From that I > surmise that the licence cost is significant; if the cost was minimal then > why have a separate SKU without the ARM? > > One other possibility is that a separate SKU allows the ARM to be faulty > and the part still saleable, but it seems that ballpark 80-90% of the > eval boards I see are offering parts without ARMs. Which suggests there's a > strong motivation not to use it. > > > (And whatever the numbers, RISC-V changes things significantly.) > > I'm not sure RISC-V is to the level of maturity for baking a Cortex A53 > equivalent into a critical product. > > Theo
It is fairly common for FPGA vendors to sell subsets of a physical chip by disabling a portion of the chip (for example effectively cutting the chip in half or disabling a processor core). Look at the number of configuration bits.
On 21/4/20 8:47 pm, Rick C wrote:
> On Monday, April 20, 2020 at 8:23:30 PM UTC-4, Clifford Heath wrote: >> On 21/4/20 10:01 am, Rick C wrote: >>> On Monday, April 20, 2020 at 8:59:36 AM UTC-4, Clifford Heath wrote: >>>> On 20/4/20 4:52 am, Przemek Klosowski wrote: >>>>> On Thu, 16 Apr 2020 17:13:41 -0700, Paul Rubin wrote: >>>>> >>>>>> Grant Edwards <invalid@invalid.invalid> writes: >>>>>>> Definitely. The M-class parts are so cheap, there's not much point in >>>>>>> thinking about doing it in an FPGA. >>>>>> >>>>>> Well I think the idea is already you have other stuff in the FPGA, so >>>>>> you save a package and some communications by dropping in a softcore >>>>>> rather than using an external MCU. I'm surprised that only high end >>>>>> FPGA's currently have hard MCU's already there. Just like they have DSP >>>>>> blocks, ram blocks, SERDES, etc., they might as well put in some CPU >>>>>> blocks. >>>>> >>>>> Maybe Risc-V will catch on. The design is FOSS, as is the toolchain (GDB >>>>> and LLVM have Risc-V backends already for a while), and the simple >>>>> versions take very few gates. >>>>> //github.com/SpinalHDL/VexRiscv >>>>> //hackaday.com/2019/11/19/emulating-risc-v-on-an-fpga/ >>>>> >>>> >>>> There's a lot of push in the direction of the Power architecture. What >>>> does that look like in FPGA? >>>> >>>> CH >>> >>> Do you mean the Power PC? That was the hard IP used in the very old and possibly obsolete Virtex II Pro devices. >>> >>> Why do they have to use such goofy names like "Pro" or "Polarfire". Do they really think that sells even one frigging chip? I would be so much more inclined to dig through their information if they just had decent names that give you some idea of the technical details including the heritage. >> >> Hah! I hear you. :) >> >> Yes, I mean PowerPC, specifically OpenPoWER: >> <//en.wikipedia.org/wiki/OpenPOWER_Foundation> >> >> A friend is a fan. Me, I haven't read much about it. >> >> CH > > Interesting. 
I don't see any mention of the Power architecture in James Brakefield's soft IP charts
David corrected me - Power != PowerPC.
> I want to get back to my stack/register ISA design at some point. I've looked at the potential for it being an efficient instruction set (which it is), but I haven't explored the implementation issues for the hardware fully yet.
For my computer architecture subject in 1980 I designed an 8-bit stack architecture machine that had a 4-bit "arithmetic" mode that just did operations on the stack... similar to the "thumb" mode of ARM. Possibly even close enough to invalidate the ARM patents in the area. Either way, it would be very simple to create this CPU in an FPGA, and it was designed to be very easy to compile for (from C). I got full marks for the project, and that entire subject. CH
On Tue, 21 Apr 2020 18:10:41 +0000 (UTC), Grant Edwards
<invalid@invalid.invalid> wrote:

>On 2020-04-21, upsidedown@downunder.com <upsidedown@downunder.com> wrote: > >> With EtherCAT, it is possible to make very small nodes with only a few >> bits added/removed from the frame circulating around the industrial >> plants with only a few bit time additional propagation delay in each >> node. So it looks good. > >For something like simple digital I/O, you don't need a uController at >all, the Beckhoff ET1100 EtherCAT controller can act as a stand-alone >slave device.
With Beckhoff bus terminals, you can stack a number of simple I/O modules together. A module can be as simple as one with 2 digital inputs or another with 2 digital outputs, or it can be a more complicated I/O module. At the end of the stack you just attach a fieldbus module. This could be e.g. Modbus, CAN bus or in your case an EtherCAT module. You can change e.g. from CAN to EtherCAT by simply replacing the fieldbus module at the end. No need to disassemble the I/O module stack. Even this example shows the problem of interfacing only a few I/O bits to a higher level system. It doesn't make sense to make fieldbus interfaces if you just need, say, 2 digital inputs. In this case, the Beckhoff bus terminal stack acts as a concentrator, so that discrete signals from a larger area are wired into the module stack. While logically the EtherCAT protocol would allow nodes to effectively handle only a few digital I/O on each node, it is not economically practical. Of course, if EtherCAT I/O controllers could be made into 8 to 14 pin chips (2 x power, 2 x Ethernet, 2-4 digital I/O pins), the situation would be different. But you would still need two magnetics.
> >> What is the point of using multicore processors, if a single core can >> perform the basic EtherCAT node functionality. > >What if you also want to run a web server and some other heavy-duty, >encrypted, protocols under Linux in your EtherCAT slave?
Would one really want to have a large number of such stations all around a plant, each exchanging only a few bits? Use some hierarchical system, but the expected advantage of EtherCAT is lost.
>The most >practical way to do that is with something like the Renesas RZ/N1D >which has an EtherCAT controller, a Cortex M3 optimized for real-time >stuff, and a couple Cortex A7 cores for running Linux. [There are >other vendors with similar mult-core uControllers.] > >> In addition, if there are dozens of series connected twisted pair >> connectors, what is the electromechanical reliability of each >> connection ? A single fault will prevent the Ethernet frame >> circulating back to the master. > >If single point of failure is an issue, then you can connect the >EtherCAT devices in a loop to get some redundancy.
The EtherCAT has the same reliability issues as 10Base2 and 10Base5 coaxial Ethernets with a large number of connections to a single bus.
On 21/04/2020 21:42, Rick C wrote:
> On Tuesday, April 21, 2020 at 11:25:37 AM UTC-4, David Brown wrote: >> On 21/04/2020 15:15, Rick C wrote: >>> On Tuesday, April 21, 2020 at 8:02:18 AM UTC-4, David Brown >>> wrote: >>>> On 21/04/2020 02:36, Rick C wrote: >>>>> On Monday, April 20, 2020 at 9:58:09 AM UTC-4, David Brown >>>>> wrote: >>>>>> On 18/04/2020 21:38, Rick C wrote: >>>>>>> On Saturday, April 18, 2020 at 9:06:57 AM UTC-4, David >>>>>>> Brown wrote: >>>>>>>> >>>> >>>>>> I need an MCU with 4 EtherCAT slave channels. There are >>>>>> exactly 0 on the market. There are only two or three in >>>>>> total - from all manufacturers together - with even /one/ >>>>>> EtherCAT slave. >>>>> >>>>> Yes, because EtherCAT is not widely used at the moment. I >>>>> had never heard of it. When I read about it I see some car >>>>> makers are looking at adopting it. Once that happens there >>>>> will be MCUs supporting the interface. Until then it is a >>>>> niche market. Am I wrong? I don't see any indication there >>>>> is much out there either in the supply or demand side. >>>>> >>>> >>>> EtherCAT has been increasingly popular in industrial automation >>>> (the world of Programmable Logic Controllers, Profibus, >>>> Frequency Converters, etc.). >>> >>> You say "increasingly popular" but if it were being used in >>> higher volumes MCUs with EtherCAT interfaces would be available. >>> MCU makers aren't stupid and love to have any advantage over the >>> competition they can find. >>> >>> So "popular" has to be something other than unit volume. >> >> I wrote "increasingly popular", because it is becoming >> increasingly popular. That means both that more and more people >> are using EtherCAT devices, more and more EtherCAT devices are >> being installed, more and more EtherCAT devices are being >> developed, more and more EtherCAT MCUs, standand-alone peripherals, >> and FPGA cores have become available in recent years. >> >> In the big picture of MCU sales, EtherCAT usage is tiny. /Really/ >> tiny. 
Less tiny than five years ago, but still tiny. Making an >> EtherCAT peripheral in a MCU is not an insignificant investment for >> an MCU company - it would be a very big investment. They won't do >> that until they foresee a sizeable market - far greater than the >> automation market. Until then, it will be left to the few who are >> heavily involved in this sort of thing, such as Infineon (Siemens >> has always been a big player in the automation world). > > So your use of "more and more" is not relevant to the MCU market > which is what I've been talking about.
I am not sure how I could have been clearer.
> > I don't know anything about the automation market, so I have to > assume it is not so large if the MCU makers are ignoring a peripheral > that is used "more and more" in that market for some value of "more > and more".
Relatively speaking, it is not a big market - numbers are a lot smaller than automotive or consumer markets. And it is quite a conservative market, with people using the same devices for decades. (This also means manufacturers have to commit to very long product lifetimes in this branch.) I also did not say, or imply, that MCU makers are ignoring this peripheral. I said they don't make many devices that support it - and I said that the number of devices supporting EtherCAT has been increasing in recent years. I am sure the big MCU makers are following EtherCAT closely, and I am sure they have devices under development. When Ford, or Toyota, or Volkswagen tells NXP and Texas Instruments that they are interested in small microcontrollers with EtherCAT slave devices, the MCU makers are /not/ going to say "EtherCAT? What's that?". They are going to say "We've got some ideas under development. What combination of cpu, memories and peripherals do you want? We'll put the bricks together and do some samples". For all I know, some of these companies already have devices for their big customers - these can be made years before mere mortals get to hear about them.
> > I know designing a CPU chip is costly, but the cost depends greatly > on the process used. The CPUs in a cell phone cost millions just for > the mask set. CPUs on the 150 nm node with 256 kB of flash, not so > much. What level of CPU is married to a EtherCAT interface in the > designs you see? I was thinking a CM4 would be appropriate. >
A Cortex-M4 would be fine for simpler EtherCAT slaves. It's possible to use them with even smaller devices (or no microcontroller at all - EtherCAT slave peripherals usually support a "remote digital I/O" mode). But based on the size and speed needed for an EtherCAT module it would be silly /not/ to have something like an M4. (We are using an M7 chip with them.)
> >>>> It's the stuff that runs factories, and programmed and set up >>>> by automation engineers that are a kind of cross between >>>> electricians and software developers. Characteristics of >>>> electronics in this field are that they are often quite >>>> expensive, but designed to fit together and "just work" even >>>> when made by different companies. Most of the stuff is made by >>>> relatively few large companies, rather than small companies. >>>> Implementing many of the protocols involved are quite horrible >>>> - badly specified (with large fees to be paid before you can >>>> even see the documents), overly complex, and typically require >>>> complicated XML-based "descriptors" that make USB descriptors >>>> look simple. But while that stuff makes them unpleasant to >>>> implement, it makes them very easy to use for the people >>>> actually making the automation setups. >>>> >>>> EtherCAT is also quite complicated, but a lot of it is handled >>>> by dedicated slave controller chips and software stacks that >>>> are available. >>> >>> So what sort of price premium are these peripheral chips adding >>> to the BoM? >>> >> >> I don't deal with prices at that level, but Digikey puts them at >> about $10. > > That's pretty significant compared to a $5 XMOS or a $3 MCU chip. >
Yes. But development costs, development time, development resources are all important too. BOM prices are rarely irrelevant, but not always the most important factor. Also, those chips do a good deal more than we can get from a tiny XMOS - we'd need a much bigger XMOS and external PHYs. Maybe XMOS with EtherCAT modules would be a BOM cost win, maybe not.
>> No, I have /not/ been suggesting EtherCAT would be a killer app >> for XMOS. You seem to have combined various posts, adding 2 plus 3 >> to get 17. > > Ok, whatever. I asked a question.
You did - but that question showed that you badly misunderstood other things I wrote. Perhaps the quantity of posts here, and their lengths, has simply got out of hand. It becomes impossible to track everything that is said. I know that I have to snip and skimp on posts.
On 21/04/2020 19:29, upsidedown@downunder.com wrote:
> On Sat, 18 Apr 2020 15:06:53 +0200, David Brown > <david.brown@hesbynett.no> wrote: > >> >> Implementing an Ethernet MAC on an XMOS is pointless. Implementing an >> EtherCAT slave is not going to be much harder for the XMOS than a normal >> Ethernet MAC, but is impossible on any microcontroller without >> specialised peripherals. > > Traditional industrial protocols, like Profibus and Modbus (with all > their variants) have a quite high overhead Thus, if a slave only > wants to communicate a few bits or a single byte over the network, it > will suffer a very low transfer efficiency using standard protocols. > > With EtherCAT, it is possible to make very small nodes with only a few > bits added/removed from the frame circulating around the industrial > plants with only a few bit time additional propagation delay in each > node. So it looks good. >
Yes.
> However, EtherCAT nodes are still quite expensive and if you add/drop > only a few bits in each node doesn't make economical sense.
It does make sense in the automation world. A key point is that the bits get added or dropped where you want them dropped.
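As a rough picture of "bits added or dropped where you want them", here is a toy Python model of a frame passing through a chain of nodes. It is a sketch under heavy assumptions: the one-byte-per-node layout, the `pass_frame` name and the `offset`/`inputs`/`outputs` fields are invented for illustration and are not the real EtherCAT datagram format.

```python
def pass_frame(frame, nodes):
    """Pass one frame through the nodes in wiring order.

    Each node is a dict with a hypothetical layout:
      "offset": index of its byte in the frame
      "inputs": byte it wants to report to the master
    The node's "outputs" entry is filled with what the master sent it.
    """
    frame = bytearray(frame)
    for node in nodes:
        i = node["offset"]
        node["outputs"] = frame[i]   # take the master's data for this node
        frame[i] = node["inputs"]    # put this node's data in the same slot
    return bytes(frame)


if __name__ == "__main__":
    nodes = [{"offset": 0, "inputs": 0b01}, {"offset": 1, "inputs": 0b10}]
    returned = pass_frame(b"\x03\x00", nodes)
    print(returned, [n["outputs"] for n in nodes])
```

The point of the model is only that a single frame visits every node once, and each node touches just its own slot as the frame goes by.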
> > What is the point of using multicore processors, if a single core can > perform the basic EtherCAT node functionality. You can't cut the > multicore chip and distribute it to multiple physically separate > nodes:-).
An EtherCAT slave would take more than one virtual core on an XMOS - probably 3-6, I would guess, depending on the features you want. But if you only want a few bits of data you'd use a simple EtherCAT peripheral with digital IO, not a microcontroller at all.
> > > In addition, if there are dozens of series connected twisted pair > connectors, what is the electromechanical reliability of each > connection ? A single fault will prevent the Ethernet frame > circulating back to the master. >
EtherCAT is always logically a ring, and you can (if you want) have both ends connected back to the master. That means you can break the ring - either by accident, or while changing the live network - and things carry on as before.
> I much more prefer a dual layer approach, with CANbus (or CAN FD) up > to a few meters transferring a few bits or a byte or two around the > CAN bus and using concentrator nodes with communicate to a higher > level systems using some traditional protocols, transferring perhaps > 100 bytes in a single transaction. >
Different solutions work best in different circumstances. EtherCAT is not for everyone.
On 22/04/2020 00:00, Rick C wrote:
> On Tuesday, April 21, 2020 at 12:09:56 PM UTC-4, David Brown wrote: >> On 21/04/2020 17:26, Rick C wrote: >>> On Tuesday, April 21, 2020 at 10:20:41 AM UTC-4, Theo wrote: >>>> David Brown <david.brown@hesbynett.no> wrote: >>>>> On 17/04/2020 17:34, Theo wrote: >>>>>> I think part of the problem is the ARM licensing cost - if >>>>>> the license cost is (random number) 5% of the silicon >>>>>> sticker price that's fine when it's a $1 MCU, but when it's >>>>>> a $10000 FPGA that hurts. >>>>> >>>>> I'm not sure that's valid. First, do you know that the ARM >>>>> licensing costs work that way? >>>> >>>> I have no insight into the licensing contracts (which are >>>> likely very confidential), but what I understand is that all >>>> Stratix 10 parts have an ARM but relatively few have it >>>> enabled. Additionally I understand the licence cost is only >>>> paid for parts where it is enabled. From that I surmise that >>>> the licence cost is significant; if the cost was minimal then >>>> why have a separate SKU without the ARM? >>> >>> They do the same thing with the FPGA itself. It is not >>> inexpensive to spin the masks for FPGAs at the bleeding edge of >>> semiconductor fabrication technology. So they sell parts with >>> more or less of the part enabled or even just tested (testing >>> cost in an FPGA is not inexpensive). So you buy an FPGA with >>> 50,000 LUTs or you buy one with 25,000 LUTs and it's the same >>> part. The 50,000 part has the entire chip tested, the 25,000 LUT >>> part only tests the section with 25,000 LUTs you will be using. >>> They will get the price even lower if you are buying a large >>> quantity and you give them your design, so they only test the >>> parts of the chip your design uses! >>> >>> So don't test the CPU and don't pay the license fee. Save some >>> on the license and save more on not testing the CPU and various >>> supporting logic. 
>>> >>> >>>> One other possibility is that a separate SKU allows the ARM to >>>> be faulty and the part still saleable, but it seems that >>>> ballpark 80-90% of the eval boards I see are offering parts >>>> without ARMs. Which suggests there's a strong motivation not to >>>> use it. >>> >>> I'm told if a chip fails a test, it is tossed. The savings >>> comes from not testing a section to begin with. Testing >>> equipment is not cheap and FPGAs take a lot of time on the >>> beast. >> >> That would make it different from many other large, complex parts >> where disabling failed sections and even having redundant parts in >> the design increase overall yields and lower costs. But I guess it >> depends on a balance between yields, types of failure, and testing >> costs. > > If you think about it a bit you will see the only real way to have > "redundancy" in FPGAs is to excise entire sections of the chip for a > single failure. So a 50 kLUT chip will become a 25 kLUT chip if it > has a failure(s) in one half. That's all I've heard of. Trying to > replace a small section of a chip to retain the full functionality > would result in uneven delays and that's a real problem in FPGAs. >
Yes, that may well be the way to do it. (I'd guess you could split up sections a bit more than that, especially if you are willing to relax the timing specifications for routing a little.) But even with the suggested half-disabling, it could be worth it if your yields are low. Suppose that 30% of your 50 kLUT chips have a fault - that means 70% can be sold. 70% of the remaining ones - about 20% of the dies - can then be sold as 25 kLUT devices. These are "free". All big IC designs are made with a view to minimising the waste due to production faults, because faults are not uncommon with big chips that push the limits for production. Multi-core CPUs are regularly made with more cores, and sold as fewer core parts where faulty cores are disabled. The same applies to memory of all types. And I know that Altera certainly used to have an option to buy pre-programmed devices to fit your design - these were cheaper because they could use dies that had faults which did not affect your particular design.
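The arithmetic in that example can be written out as a small Python sketch. The function and parameter names (`die_outcomes`, `fault_rate`, `salvage_rate`) are hypothetical, and the 30% and 70% figures are simply the illustrative numbers from the post, not real yield data.

```python
def die_outcomes(fault_rate, salvage_rate):
    """Split dies into (full parts, half-size parts, scrap) as fractions.

    fault_rate:   fraction of dies with at least one fault
    salvage_rate: fraction of faulty dies whose faults sit in one half only,
                  so the other half can still be sold (an assumed figure)
    """
    full = 1.0 - fault_rate
    half = fault_rate * salvage_rate
    scrap = fault_rate * (1.0 - salvage_rate)
    return full, half, scrap


if __name__ == "__main__":
    full, half, scrap = die_outcomes(0.30, 0.70)
    print(f"full: {full:.0%}, half-size: {half:.0%}, scrap: {scrap:.0%}")
```

With those inputs about 70% of dies sell as full parts, roughly 21% as half-size parts, and the remaining 9% or so are scrapped, which is where the "about 20%" in the post comes from.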
On 2020-04-22, upsidedown@downunder.com <upsidedown@downunder.com> wrote:
> >>> What is the point of using multicore processors, if a single core >>> can perform the basic EtherCAT node functionality. >> >>What if you also want to run a web server and some other heavy-duty, >>encrypted, protocols under Linux in your EtherCAT slave? > > Would one really want to have a large number of such stations all > around a plant, each exchanging only a few at bits?
What makes you think the multi-core EtherCAT slave is exchanging only a few bits? The ones with multi-core processors are typically I/O hubs that can handle many hundreds of bits. You asked what's the point of using a multi-core processor in an EtherCAT slave. I told you the reason why people design them that way: because they need the CPU power to handle other protocols simultaneously or do things like image processing.
> Use some hierarchical system, but the expected advantage of EtherCAT > is lost.
Generally, the multi-core EtherCAT slave _is_ part of a hierarchical system. For example the EtherCAT slave might be an IO-Link master with 8 attached IO-Link sensors, each of which can handle 32 bytes of input and 32 bytes of output. You seem to be arguing against using a multi-core processor in an EtherCAT slave that does nothing other than handle a few bits of DIO. Nobody does that. Nobody is proposing that.
>>> In addition, if there are dozens of series connected twisted pair >>> connectors, what is the electromechanical reliability of each >>> connection ? A single fault will prevent the Ethernet frame >>> circulating back to the master. >> >>If single point of failure is an issue, then you can connect the >>EtherCAT devices in a loop to get some redundancy. > > The EtherCAT has the same reliability issues as 10Base2 and 10Base5 > coaxial Ethernets with a large number of connections to a single > bus.
You were worried the entire network was susceptible to single-point connector failure. With a ring, it's not; you'll need a two-point failure to lose comms.
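The single-fault versus two-fault point can be checked with a small connectivity model in Python. This is a sketch with assumptions stated up front: `reachable_nodes` and its link numbering are made up for illustration, and it models only which nodes the master can still reach, not how real EtherCAT slaves loop frames back at a break.

```python
def reachable_nodes(n_nodes, broken_links, redundant):
    """Return the set of nodes (1..n_nodes) the master can still reach.

    Link k (0 <= k <= n_nodes) joins node k to node k + 1, where "node 0" is
    the master's primary port and "node n_nodes + 1" its redundant port.
    """
    broken = set(broken_links)
    reached = set()
    # Walk outward from the primary port until a broken link is hit.
    for node in range(1, n_nodes + 1):
        if node - 1 in broken:
            break
        reached.add(node)
    if redundant:
        # Walk inward from the redundant port in the same way.
        for node in range(n_nodes, 0, -1):
            if node in broken:
                break
            reached.add(node)
    return reached


if __name__ == "__main__":
    print(reachable_nodes(5, [2], redundant=False))    # line, one break
    print(reachable_nodes(5, [2], redundant=True))     # ring, one break
    print(reachable_nodes(5, [1, 3], redundant=True))  # ring, two breaks
```

In this model a single break in a plain line cuts off everything downstream, a single break in a ring loses nothing, and only a second break isolates the nodes between the two faults.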