# References

## 1

```
Ameer M.S. Abdelhadi and Guy G.F. Lemieux. Modular multi-ported SRAM-based memories.
In Proceedings of the International Symposium on Field Programmable Gate Arrays (FPGA),
pages 35–44. ACM, 2014. ISBN 978-1-4503-2671-1. doi: 10.1145/2554688.2554773. URL
http://doi.acm.org/10.1145/2554688.2554773.
```

## 2

```
SystemC. Accellera, 2.3.2 edition, October 2017. URL
http://www.accellera.org/downloads/standards/systemc.
```

## 3

```
Hassan M. Ahmed, Jean-Marc Delosme, and Martin Morf. Highly concurrent computing
structures for matrix arithmetic and signal processing. IEEE Computer, 15(1):65–82, 1982.
```

## 4

```
Raymond J. Andraka. Building a high performance bit-serial processor in an FPGA. In
Proceedings of Design SuperCon, volume 96, pages 1–5, 1996.
```

## 5

```
Oriol Arcas-Abella et al. An empirical evaluation of high-level synthesis languages and tools
for database acceleration. In Proceedings of the International Field Programmable Logic and
Applications Conference (FPL), pages 1–8. IEEE, 2014.
```

## 6

```
AMBA AXI and ACE Protocol Specification. ARM Limited, v1.0 edition, 2013. URL
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ihi0022e/index.html.
ARM IHI 0022E.
```

## 7

```
Bryce E. Bayer. Color imaging array, July 1976. US 3971065.
```

## 8

```
Samuel Bayliss and George A. Constantinides. Optimizing SDRAM bandwidth for custom
FPGA loop accelerators. In Proceedings of the International Symposium on Field
Programmable Gate Arrays (FPGA), pages 195–204. ACM, February 2012.
```

## 9

```
Marcus Bednara et al. Tradeoff analysis and architecture design of a hybrid hardware/software
sorter. In Proceedings of the International Conference on Application-specific Systems,
Architectures and Processors (ASAP), pages 299–308. IEEE, 2000.
```

## 10

```
Vaughn Betz and Jonathan Rose. VPR: A new packing, placement and routing tool for
FPGA research. In Proceedings of the International Field Programmable Logic and Applications
Conference (FPL), pages 213–222. Springer, 1997.
```

## 11

```
Guy E. Blelloch. Prefix sums and their applications. Technical Report CMU-CS-90-190, School
of Computer Science, Carnegie Mellon University, November 1990.
```

## 12

```
Stephen Brown and Jonathan Rose. FPGA and CPLD architectures: A tutorial. IEEE Design
and Test of Computers, 13(2):42–57, 1996.
```

## 13

```
Andrew Canis, Jongsok Choi, Mark Aldham, Victor Zhang, Ahmed Kammoona, Jason H
Anderson, Stephen Brown, and Tomasz Czajkowski. LegUp: high-level synthesis for
FPGA-based processor/accelerator systems. In Proceedings of the International Symposium on
Field Programmable Gate Arrays (FPGA), pages 33–36. ACM, 2011.
```

## 14

```
Jianwen Chen, Jason Cong, Ming Yan, and Yi Zou. FPGA-accelerated 3D reconstruction using
compressive sensing. In Proceedings of the International Symposium on Field Programmable
Gate Arrays (FPGA), pages 163–166. ACM, 2012.
```

## 15

```
Jason Cong, Bin Liu, Stephen Neuendorffer, Juanjo Noguera, Kees Vissers, and Zhiru Zhang.
High-level synthesis for FPGAs: From prototyping to deployment. IEEE Transactions on
Computer-aided Design of Integrated Circuits and Systems (TCAD), 30(4):473–491, 2011.
```

## 16

```
James W. Cooley and John W. Tukey. An algorithm for the machine calculation of complex
Fourier series. Mathematics of Computation, 19(90):297–301, 1965. ISSN 00255718. URL
http://www.jstor.org/stable/2003354.
```

## 17

```
Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to
Algorithms, Third Edition. MIT Press, 3rd edition, 2009. ISBN 0262033844, 9780262033848.
```

## 18

```
Philippe Coussy and Adam Morawiec. High-level synthesis, volume 1. Springer, 2010.
```

## 19

```
Steve Dai, Ritchie Zhao, Gai Liu, Shreesha Srinath, Udit Gupta, Christopher Batten, and
Zhiru Zhang. Dynamic hazard resolution for pipelining irregular loops in high-level synthesis.
In Proceedings of the International Symposium on Field Programmable Gate Arrays (FPGA),
pages 189–194, 2017. ISBN 978-1-4503-4354-1. doi: 10.1145/3020078.3021754. URL
http://doi.acm.org/10.1145/3020078.3021754.
```

## 20

```
Jerey Dean and Sanjay Ghemawat. MapReduce: Simplied data processing on large clusters.
Communications of the ACM, 51(1):107{113, January 2008. ISSN 0001-0782. doi: 10.1145/
1327452.1327492. URL http://doi.acm.org/10.1145/1327452.1327492.
```

## 21

```
Alvin M. Despain. Fourier transform computers using CORDIC iterations. IEEE Transactions
on Computers, 100(10):993–1001, 1974.
```

## 22

```
J. Detrey and F. de Dinechin. Floating-point trigonometric functions for FPGAs. In
Proceedings of the International Field Programmable Logic and Applications Conference (FPL),
pages 29–34, August 2007. doi: 10.1109/FPL.2007.4380621.
```

## 23

```
L Peter Deutsch. DEFLATE compressed data format specification version 1.3. 1996.
```

## 24

```
Jean Duprat and Jean-Michel Muller. The CORDIC algorithm: new results for fast VLSI
implementation. IEEE Transactions on Computers, 42(2):168–178, 1993.
```

## 25

```
Brian P. Flannery, William H. Press, Saul A. Teukolsky, and William Vetterling. Numerical
recipes in C. Press Syndicate of the University of Cambridge, New York, 1992.
```

## 26

```
Daniel D. Gajski, Nikil D. Dutt, Allen C.H. Wu, and Steve Y.L. Lin. High-Level Synthesis:
Introduction to Chip and System Design. Springer Science & Business Media, 2012.
```

## 27

```
W Morven Gentleman and Gordon Sande. Fast Fourier transforms: for fun and profit. In
Proceedings of the November 7-10, 1966, fall joint computer conference, pages 563–578. ACM,
1966.
```

## 28

```
Nivia George, HyoukJoong Lee, David Novo, Tiark Rompf, Kevin J. Brown, Arvind K.
Sujeeth, Martin Odersky, Kunle Olukotun, and Paolo Ienne. Hardware system synthesis from
domain-specific languages. In Proceedings of the International Field Programmable Logic and
Applications Conference (FPL), pages 1–8. IEEE, 2014.
```

## 29

```
Sumit Gupta, Rajesh Gupta, Nikil Dutt, and Alexandru Nicolau. SPARK: A Parallelizing
Approach to the High-level Synthesis of Digital Circuits. Kluwer, 2004. ISBN 1-4020-7837-4.
```

## 30

```
Scott Hauck and Andre DeHon. Reconfigurable computing: the theory and practice of
FPGA-based computation, volume 1. Morgan Kaufmann, 2010.
```

## 31

```
James Hegarty, Ross Daly, Zachary DeVito, Jonathan Ragan-Kelley, Mark Horowitz, and
Pat Hanrahan. Rigel: Flexible multi-rate image processing hardware. ACM Trans. Graph.,
35(4):85:1–85:11, July 2016. ISSN 0730-0301. doi: 10.1145/2897824.2925892. URL
http://doi.acm.org/10.1145/2897824.2925892.
```

## 32

```
M. Heideman, D. Johnson, and C. Burrus. Gauss and the history of the fast Fourier transform.
ASSP Magazine, IEEE, 1(4):14–21, October 1984. ISSN 0740-7467.
doi: 10.1109/MASSP.1984.1162257.
```

## 33

```
David A. Human. A method for the construction of minimum-redundancy codes. Proceedings
of the IRE, 40(9):1098{1101, 1952.
```

## 34

```
Ryan Kastner, Anup Hosangadi, and Farzan Fallah. Arithmetic optimization techniques for
hardware and software design. Cambridge University Press, 2010.
```

## 35

```
David Knapp. Behavioral Synthesis: Digital System Design using the Synopsys Behavioral
Compiler. Prentice-Hall, 1996. ISBN 0-13-569252-0.
```

## 36

```
Donald Ervin Knuth. The art of computer programming: sorting and searching, volume 3.
Pearson Education, 1998.
```

## 37

```
Charles Eric Laforest, Zimo Li, Tristan O'Rourke, Ming G. Liu, and J. Gregory Steffan.
Composing multi-ported memories on FPGAs. ACM Transactions on Reconfigurable
Technology and Systems (TRETS), 7(3):16:1–16:23, September 2014. ISSN 1936-7406.
doi: 10.1145/2629629. URL http://doi.acm.org/10.1145/2629629.
```

## 38

```
Glen G. Langdon Jr, Joan L. Mitchell, William B. Pennebaker, and Jorma J. Rissanen.
Arithmetic coding encoder and decoder system, February 27 1990. US Patent 4,905,297.
```

## 39

```
Dajung Lee, Janarbek Matai, Brad Weals, and Ryan Kastner. High throughput channel
tracking for JTRS wireless channel emulation. In Proceedings of the International Field
Programmable Logic and Applications Conference (FPL). IEEE, 2014.
```

## 40

```
Edward A. Lee and David G. Messerschmitt. Pipeline interleaved programmable DSPs:
Architecture. IEEE Transactions on Acoustics, Speech, and Signal Processing (TASSP),
35(9):1320–1333, September 1987.
```

## 41

```
Edward A. Lee and Pravin Varaiya. Structure and Interpretation of Signals and Systems,
Second Edition. 2011. ISBN 0578077191. URL LeeVaraiya.org.
```

## 42

```
Edward Ashford Lee. Plato and the Nerd: The Creative Partnership of Humans and
Technology. MIT Press, 2017. ISBN 978-0262036481.
```

## 43

```
C. Leiserson, F. Rose, and J. Saxe. Optimizing synchronous circuitry by retiming. In Third
Caltech Conference On VLSI, 1983.
```

## 44

```
G. Liu, M. Tan, S. Dai, R. Zhao, and Z. Zhang. Architecture and synthesis for area-efficient
pipelining of irregular loop nests. IEEE Transactions on Computer-aided Design of Integrated
Circuits and Systems (TCAD), 36(11):1817–1830, November 2017. ISSN 0278-0070. doi:
10.1109/TCAD.2017.2664067.
```

## 45

```
Rui Marcelino et al. Sorting units for FPGA-based embedded systems. In Distributed Embedded
Systems: Design, Middleware and Resources, pages 11–22. Springer, 2008.
```

## 46

```
Janarbek Matai, Pingfan Meng, Lingjuan Wu, Brad T Weals, and Ryan Kastner. Designing a
hardware in the loop wireless digital channel emulator for software defined radio. In Proceedings
of the International Conference on Field-Programmable Technology (FPT). IEEE, 2012.
```

## 47

```
Janarbek Matai, Joo-Young Kim, and Ryan Kastner. Energy efficient canonical Huffman
encoding. In Proceedings of the International Conference on Application-specific Systems,
Architectures and Processors (ASAP). IEEE, 2014.
```

## 48

```
Janarbek Matai, Dustin Richmond, Dajung Lee, and Ryan Kastner. Enabling FPGAs for the
masses. arXiv preprint arXiv:1408.5870, 2014.
```

## 49

```
Janarbek Matai, Dustin Richmond, Dajung Lee, Zac Blair, Qiongzhi Wu, Amin Abazari, and
Ryan Kastner. Resolve: Generation of high-performance sorting architectures from high-level
synthesis. In Proceedings of the International Symposium on Field Programmable Gate Arrays
(FPGA), pages 195–204. ACM, 2016. ISBN 978-1-4503-3856-1. doi: 10.1145/2847263.2847268.
URL http://doi.acm.org/10.1145/2847263.2847268.
```

## 50

```
Carver Mead and Lynn Conway. Introduction to VLSI systems, volume 1080. Addison-Wesley
Reading, MA, 1980.
```

## 51

```
Giovanni De Micheli. Synthesis and optimization of digital circuits. McGraw-Hill Higher
Education, 1994.
```

## 52

```
Shahnam Mirzaei, Anup Hosangadi, and Ryan Kastner. FPGA implementation of high speed
FIR filters using add and shift method. In Computer Design, 2006. ICCD 2006. International
Conference on, pages 308–313. IEEE, 2007.
```

## 53

```
MISRA. Guidelines for the Use of the C Language in Critical Systems. March 2013. ISBN
978-1-906400-10-1. URL https://www.misra.org.uk.
```

## 54

```
Rene Mueller et al. Sorting networks on FPGAs. The VLDB Journal - The International
Journal on Very Large Data Bases, 21(1):1–23, 2012.
```

## 55

```
Jorge Ortiz et al. A streaming high-throughput linear sorter system with contention buffering.
International Journal of Reconfigurable Computing, 2011.
```

## 56

```
Marios C. Papaefthymiou. Understanding retiming through maximum average-weight cycles.
In SPAA '91: Proceedings of the third annual ACM symposium on Parallel algorithms and
architectures, pages 338–348, 1991.
```

## 57

```
William B. Pennebaker. JPEG: Still image data compression standard. Springer, 1992.
```

## 58

```
Robert Sedgewick. Algorithms in C. Addison-Wesley, 2001. ISBN 978-0201756081.
```

## 59

```
Bhaskar Sherigar and Valmiki K. Ramanujan. Huffman decoder used for decoding both
advanced audio coding (AAC) and MP3 audio, June 29 2004. US Patent App. 10/880,695.
```

## 60

```
F. Winterstein, S. Bayliss, and G. A. Constantinides. High-level synthesis of dynamic data
structures: A case study using Vivado HLS. In Proceedings of the International Conference
on Field-Programmable Technology (FPT), pages 362–365, December 2013.
doi: 10.1109/FPT.2013.6718388.
```

## 61

```
Ian H. Witten, Radford M. Neal, and John G. Cleary. Arithmetic coding for data compression.
Communications of the ACM, 30(6):520–540, 1987.
```

## 62

```
UltraScale Architecture Congurable Logic Block (UG574). Xilinx, v1.5 edition,
February 2017. URL https://www.xilinx.com/support/documentation/user_guides/
ug574-ultrascale-clb.pdf.
```

## 63

```
Vivado Design Suite User Guide: High-Level Synthesis (UG902). Xilinx, v2017.1 edition, April
2017. URL
https://www.xilinx.com/support/documentation/sw_manuals/xilinx2017_1/ug902-vivado-high-level-synthesis.pdf.
```

## 64

```
UltraScale Architecture Conguration (UG570). Xilinx, v1.7 edition, March
2017. URL https://www.xilinx.com/support/documentation/user_guides/
ug570-ultrascale-configuration.pdf.
```
